Anthropic 429 retries

What engineers usually see

  • Claude API returns HTTP 429 (rate limited)
  • Anthropic uses different rate limit windows than OpenAI
  • Retry behavior may not align with the provider's reset times
  • No way to track the cost of rate-limited requests

Why this is hard to debug

Anthropic's rate limits work differently from OpenAI's, with their own limit types and reset windows. Standard retry logic tuned for OpenAI may not match Anthropic's reset windows, so retries can fire before the window reopens and burn attempts for nothing. Receipts capture the provider-specific rate limit behavior for each request.

Minimal repro

curl https://aibadgr.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_ANTHROPIC_KEY" \
  -H "X-Provider: anthropic" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "messages": [{"role": "user", "content": "test"}],
    "max_tokens": 100
  }'

This request routes through AI Badgr and returns a stable request ID that links to an execution record.

Note: AI Badgr is OpenAI-compatible and works as a drop-in proxy. No SDK changes required; only the base_url changes.
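The same call can be built programmatically. A stdlib Python sketch mirroring the curl example above, with the send step left commented out; YOUR_ANTHROPIC_KEY is the same placeholder as in the curl example:

```python
import json
import urllib.request

BASE_URL = "https://aibadgr.com/v1"  # the only thing that changes for the drop-in proxy

body = {
    "model": "claude-3-5-sonnet-20241022",
    "messages": [{"role": "user", "content": "test"}],
    "max_tokens": 100,
}

# Build the request without sending it, matching the curl example's headers.
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(body).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_ANTHROPIC_KEY",
        "X-Provider": "anthropic",
    },
    method="POST",
)
# resp = urllib.request.urlopen(req)  # uncomment to actually send
```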

What a per-request execution record makes visible

  • Anthropic-specific rate limit type
  • Rate limit window and reset time
  • Retry attempts and timing
  • Whether retries aligned with reset
  • Cost per attempt
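Given a receipt's timestamps, the alignment check in the list above is a simple comparison. A sketch with hypothetical timestamps, assuming the window's reset time is available as an RFC 3339 value:

```python
from datetime import datetime

def retry_aligned(retry_at: datetime, reset_at: datetime) -> bool:
    """True if the retry fired at or after the window reset,
    i.e. when it actually had a chance of succeeding."""
    return retry_at >= reset_at

# Hypothetical timestamps in RFC 3339 form: one retry before the
# reset (wasted attempt) and one after it.
reset_at = datetime.fromisoformat("2024-11-01T12:00:30+00:00")
early_retry = datetime.fromisoformat("2024-11-01T12:00:10+00:00")
late_retry = datetime.fromisoformat("2024-11-01T12:00:35+00:00")
```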

Run 1 request → get receipt

Change your base URL to https://aibadgr.com/v1 and run your request.

The response includes an X-Badgr-Request-Id header that links to a receipt showing latency, retries, tokens, cost, and failure stage for that specific execution.

Not the engineer?
Share this page with your dev and ask them to run one request through AI Badgr. That's all that's needed to get the receipt.

Problems like this only become debuggable when you can see what happened to a single request from start to finish, instead of trying to piece it together from scattered logs.