Streaming hangs with no chunks

What engineers usually see

  • Stream connection opens but no chunks arrive
  • Client waits indefinitely without timeout or error
  • No clear failure signal from the provider
  • May eventually time out, with no indication of which stage failed
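
The "waits indefinitely" symptom can be guarded against client-side with a per-chunk stall timeout. A minimal sketch using only the Python standard library — `stream` is any iterator of chunks (e.g. an SDK's stream object), and `stall_timeout` is an illustrative parameter, not part of any SDK; note that errors raised inside the stream are not propagated in this sketch:

```python
import queue
import threading

def with_stall_timeout(stream, stall_timeout=30.0):
    """Yield chunks from `stream`, raising TimeoutError if the gap
    between consecutive chunks exceeds `stall_timeout` seconds."""
    q = queue.Queue()
    _SENTINEL = object()

    def pump():
        # Drain the upstream iterator on a background thread so the
        # consumer can enforce a timeout on each chunk independently.
        try:
            for chunk in stream:
                q.put(chunk)
        finally:
            q.put(_SENTINEL)  # always signal end-of-stream

    threading.Thread(target=pump, daemon=True).start()
    while True:
        try:
            item = q.get(timeout=stall_timeout)
        except queue.Empty:
            raise TimeoutError(f"no chunk received for {stall_timeout}s")
        if item is _SENTINEL:
            return
        yield item
```

Wrapping your existing stream iterator this way turns a silent hang into an explicit `TimeoutError` you can log and retry.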

Why this is hard to debug

Streaming failures are hard to debug because they blur the line between network problems, provider-side issues, and genuine bugs in your own code. Without a receipt for each request, you're left guessing whether the stream ever started, where it stalled, and whether you were still charged for the tokens.

Minimal repro

curl https://aibadgr.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_OPENAI_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "hello"}],
    "stream": true
  }'

This request routes through AI Badgr and returns a stable request ID that links to an execution record.

Note: AI Badgr is OpenAI-compatible and works as a drop-in proxy. No SDK changes required — only the base_url changes.
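
The same repro can be built in Python with only the standard library, which makes it easy to inspect exactly what goes over the wire. A sketch — the request is constructed but not sent here, and YOUR_OPENAI_KEY is a placeholder:

```python
import json
import urllib.request

body = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "hello"}],
    "stream": True,
}
req = urllib.request.Request(
    "https://aibadgr.com/v1/chat/completions",
    data=json.dumps(body).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_OPENAI_KEY",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it. With an OpenAI-compatible
# SDK, the same effect is just pointing base_url at https://aibadgr.com/v1
# — the path, headers, and body are unchanged.
```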

What a per-request execution record makes visible

  • First token latency (time until first chunk)
  • Total chunks received and bytes streamed
  • Exact point where stream stalled or completed
  • Whether tokens were consumed despite failure
  • Full timing breakdown (connection, TTFB, streaming)
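
For comparison, a rough client-side version of the first two metrics can be collected by instrumenting the stream yourself. A sketch, pure standard library — `chunks` is any iterable of string or byte chunks:

```python
import time

def measure_stream(chunks):
    """Consume a chunk iterator, returning (chunks, stats) where stats
    records first-chunk latency, chunk count, and total size."""
    stats = {"first_chunk_s": None, "chunks": 0, "bytes": 0}
    start = time.monotonic()
    out = []
    for chunk in chunks:
        if stats["first_chunk_s"] is None:
            # Time from starting the iteration to the first chunk.
            stats["first_chunk_s"] = time.monotonic() - start
        stats["chunks"] += 1
        stats["bytes"] += len(chunk)
        out.append(chunk)
    return out, stats
```

This only sees what reaches your process; it can't tell you about retries, server-side timing stages, or token billing, which is what the execution record adds.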

Run 1 request → get receipt

Change your base URL to https://aibadgr.com/v1 and run your request.

The response includes an X-Badgr-Request-Id header that links to a receipt showing latency, retries, tokens, cost, and failure stage for that specific execution.
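
Capturing that request ID at the call site is worth wiring into your logs so a hung stream can later be matched to its receipt. A sketch, assuming only that the response exposes its headers as a dict-like object — the header name is from this page, and the log format is illustrative:

```python
def log_badgr_request_id(headers, logger=print):
    """Pull the per-request ID out of response headers so a failure
    can be matched to its execution record later."""
    # Header lookup may be case-sensitive depending on the HTTP client.
    request_id = (headers.get("X-Badgr-Request-Id")
                  or headers.get("x-badgr-request-id"))
    if request_id:
        logger(f"badgr_request_id={request_id}")
    return request_id
```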

Not the engineer?
Share this page with your dev and ask them to run one request through AI Badgr. That's all that's needed to get the receipt.

Failures like this only become diagnosable when you can see what happened to a single request from start to finish, instead of trying to piece it together from scattered logs.