What engineers usually see
- Claude API streaming starts normally
- Stream stalls during content generation
- No error message or completion event
- Connection remains open but idle
Why this is hard to debug
Anthropic's streaming doesn't expose internal retry logic or rate limit signals. When streams stall, you can't tell if it's a temporary hiccup, rate limit backoff, or permanent failure without per-request receipts.
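Client-side, the only observable signal is the gap between streamed chunks. A minimal stall-detection sketch (the function name, threshold, and timestamp list are illustrative; in practice you would record the arrival time of each SSE chunk yourself):

```python
from typing import Optional, Sequence

def find_stall(chunk_times: Sequence[float], threshold_s: float) -> Optional[int]:
    """Return the index of the first chunk whose gap from the previous
    chunk exceeds threshold_s seconds, or None if the stream never stalled."""
    for i in range(1, len(chunk_times)):
        if chunk_times[i] - chunk_times[i - 1] > threshold_s:
            return i
    return None

# Chunks arrived at 0.0s, 0.2s, 0.4s, then nothing until 15.0s:
# a 14.6s gap against a 10s threshold flags chunk index 3 as the stall point.
print(find_stall([0.0, 0.2, 0.4, 15.0], threshold_s=10.0))
```

This tells you *that* the stream stalled, but not *why* — which is exactly the gap a per-request execution record fills.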
Minimal repro
curl https://aibadgr.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_ANTHROPIC_KEY" \
  -H "X-Provider: anthropic" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "messages": [{"role": "user", "content": "Explain AI"}],
    "stream": true,
    "max_tokens": 1000
  }'

This request routes through AI Badgr and returns a stable request ID that links to an execution record.
Note: AI Badgr is OpenAI-compatible and works as a drop-in proxy. No SDK changes required — only the base_url changes.
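The same request from Python, built with only the standard library. A sketch assuming the OpenAI-compatible endpoint shown in the curl example above; the function name and key placeholder are illustrative:

```python
import json

# The only change vs. calling the provider directly is this base URL.
BASE_URL = "https://aibadgr.com/v1"

def build_stream_request(api_key: str, prompt: str):
    """Assemble the URL, headers, and JSON body for a streaming chat request."""
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
        "X-Provider": "anthropic",
    }
    body = json.dumps({
        "model": "claude-3-5-sonnet-20241022",
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
        "max_tokens": 1000,
    }).encode()
    return url, headers, body
```

Pass the result to any HTTP client (`urllib.request`, `requests`, or an OpenAI SDK pointed at `base_url=BASE_URL`) and the rest of your code is unchanged.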
What a per-request execution record makes visible
- Stream health timeline
- Gaps between chunks (stall detection)
- Provider-side rate limit signals
- Retry attempts (if any)
- Final stream status
Run 1 request → get receipt
Change your base URL to https://aibadgr.com/v1 and run your request.
The response includes an X-Badgr-Request-Id header that links to a receipt showing latency, retries, tokens, cost, and failure stage for that specific execution.
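Capturing that header is a one-liner in most clients. A sketch of a case-insensitive lookup (HTTP header names are case-insensitive, and some clients normalize them; the example ID value is hypothetical):

```python
from typing import Mapping, Optional

def badgr_request_id(headers: Mapping[str, str]) -> Optional[str]:
    """Pull the per-request receipt ID from response headers,
    matching the header name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "x-badgr-request-id":
            return value
    return None

# e.g. badgr_request_id(response.headers) -> "req_abc123" (hypothetical ID)
```

Log this ID alongside your own request logs so a stalled stream can be traced back to its receipt later.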
Not the engineer?
Share this page with your dev and ask them to run one request through AI Badgr. That's all that's needed to get the receipt.
Debugging a stalled stream only works when you can see what happened to a single request from start to finish, instead of trying to piece it together from scattered logs.