LLM streaming client hangs

What engineers usually see

  • Client application enters infinite wait state
  • Stream connection appears open
  • No data received and no error thrown
  • Application becomes unresponsive

Why this is hard to debug

Client-side hangs are invisible to backend monitoring. Without a request ID linking client behavior to provider execution, you can't tell whether the problem is the network, the provider, or your application logic.

Minimal repro

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'YOUR_OPENAI_KEY',
  baseURL: 'https://aibadgr.com/v1'
});

const stream = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'test' }],
  stream: true
});

// If the stream stalls, this loop simply waits forever: no chunk, no error.
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

This request routes through AI Badgr and returns a stable request ID that links to an execution record.

Note: AI Badgr is OpenAI-compatible and works as a drop-in proxy. No SDK changes are required; only the base URL (baseURL in the Node SDK) changes.
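Even before correlating with server-side records, you can make the hang fail loudly instead of blocking forever. A minimal sketch of an inactivity timeout; the withInactivityTimeout helper below is hypothetical (not part of any SDK) and simply races each chunk read against a timer:

```javascript
// Hypothetical helper: wraps any async iterable (such as an SDK stream)
// and throws if no chunk arrives within `ms` milliseconds, so a stalled
// stream surfaces as an error instead of an infinite wait.
async function* withInactivityTimeout(iterable, ms) {
  const it = iterable[Symbol.asyncIterator]();
  while (true) {
    let timer;
    const timeout = new Promise((_, reject) => {
      timer = setTimeout(
        () => reject(new Error(`stream stalled: no chunk received in ${ms} ms`)),
        ms
      );
    });
    let result;
    try {
      // Whichever settles first wins: the next chunk or the stall timer.
      result = await Promise.race([it.next(), timeout]);
    } finally {
      // Clear the timer so a won race doesn't later fire an unhandled rejection.
      clearTimeout(timer);
    }
    if (result.done) return;
    yield result.value;
  }
}
```

In the repro above, the loop would become for await (const chunk of withInactivityTimeout(stream, 30000)) { ... }, turning a silent hang into a caught error after 30 seconds. Note the helper only detects the stall; aborting the underlying HTTP request is still up to the caller.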

What a per-request execution record makes visible

  • Server-side request status
  • Whether provider responded
  • Network connectivity timeline
  • Provider completion vs client hang timing
  • Stable request ID for client-to-provider correlation

Run 1 request → get receipt

Change your base URL to https://aibadgr.com/v1 and run your request.

The response includes an X-Badgr-Request-Id header that links to a receipt showing latency, retries, tokens, cost, and failure stage for that specific execution.
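To correlate a hang with its receipt, capture the ID before iterating the stream. A small sketch; the badgrRequestId helper is hypothetical, and the header name is taken from the text above:

```javascript
// Hypothetical helper: read the correlation ID from a Fetch API response.
function badgrRequestId(response) {
  // Headers.get() is case-insensitive per the Fetch standard.
  return response.headers.get('x-badgr-request-id');
}

// Works against any Fetch-style Response (placeholder ID for illustration):
const demo = new Response('ok', {
  headers: { 'X-Badgr-Request-Id': 'req_abc123' }
});
console.log(badgrRequestId(demo)); // → req_abc123
```

With the openai Node SDK (v4+), the raw response is reachable via .withResponse(), e.g. const { data: stream, response } = await openai.chat.completions.create({ ..., stream: true }).withResponse(); logging badgrRequestId(response) up front means that even if the loop later hangs, you already have the ID needed to pull the receipt.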

Not the engineer?
Share this page with your dev and ask them to run one request through AI Badgr. That's all that's needed to get the receipt.

Debugging a hang like this only works when you can see what happened to a single request from start to finish, instead of piecing it together from scattered logs.