What engineers usually see
- OpenAI returns 429 with a Retry-After header
- Client may or may not respect the delay
- Unclear whether automatic retries will be attempted
- No way to track the cost of failed vs. retried requests
Why this is hard to debug
Rate-limit headers aren't always logged, and clients don't always respect them. So you can't tell whether your app actually waited the right amount of time, how many retries happened, or whether you were charged tokens for the failed attempts. That's a frustrating place to debug from.
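To make "respecting the delay" concrete, here is a minimal retry-loop sketch. The `send` callable and its `(status, headers, body)` return shape are stand-ins for whatever HTTP client you use, not part of any SDK:

```python
import time

def call_with_retries(send, max_retries=3):
    """Retry a request on HTTP 429, honoring Retry-After when present.

    `send` is a placeholder for your actual HTTP call; it should
    return a (status_code, headers_dict, body) tuple.
    """
    statuses = []
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        statuses.append(status)
        if status != 429:
            return body, statuses
        if attempt < max_retries:
            # Use the provider's hint; fall back to exponential backoff.
            delay = float(headers.get("Retry-After", 2 ** attempt))
            time.sleep(delay)
    return None, statuses
```

An execution record is what tells you whether a loop like this actually slept for `delay`, and how many attempts it burned before succeeding.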
Minimal repro
curl https://aibadgr.com/v1/chat/completions \
-H "Authorization: Bearer YOUR_OPENAI_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": "test"}]
}'
This request routes through AI Badgr and returns a stable request ID that links to an execution record.
Note: AI Badgr is OpenAI-compatible and works as a drop-in proxy. No SDK changes required — only the base_url changes.
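The curl command above maps one-to-one onto a plain Python request. This sketch uses only the standard library and stops short of actually sending; the API key and model values are placeholders:

```python
import json
import urllib.request

BASE_URL = "https://aibadgr.com/v1"  # the only change vs. calling OpenAI directly

def build_chat_request(api_key, model, content):
    """Build the same chat-completions POST shown in the curl example."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("YOUR_OPENAI_KEY", "gpt-4o-mini", "test")
# urllib.request.urlopen(req) would send it; the response then carries
# the X-Badgr-Request-Id header linking to the execution record.
```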
What a per-request execution record makes visible
- Retry-After value from provider
- Whether client respected the delay
- Number of retry attempts
- Total latency including retries
- Cost per attempt (even 429s can incur charges)
Run 1 request → get receipt
Change your base URL to https://aibadgr.com/v1 and run your request.
The response includes an X-Badgr-Request-Id header that links to a receipt showing latency, retries, tokens, cost, and failure stage for that specific execution.
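If you want that ID alongside your own request logs, a small helper is enough. This is a sketch with a hypothetical name, assuming headers arrive as a plain dict:

```python
def badgr_request_id(headers):
    """Find the X-Badgr-Request-Id header, matching case-insensitively,
    so the receipt can be linked from your own logs."""
    for name, value in headers.items():
        if name.lower() == "x-badgr-request-id":
            return value
    return None
```

Case-insensitive matching matters because HTTP header casing varies by client and proxy.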
Not the engineer?
Share this page with your dev and ask them to run one request through AI Badgr. That's all that's needed to get the receipt.
Retry behavior like this only becomes debuggable when you can see what happened to a single request from start to finish, instead of trying to piece it together from scattered logs.