What engineers usually see
- OpenAI streaming endpoint stops sending chunks mid-response
- Connection remains open but no data flows
- Client application hangs waiting for completion
- Unclear if this is a network issue or a provider rate limit
Why this is hard to debug
OpenAI doesn't give you per-request diagnostics, so when a stream hangs, you're left wondering if it was rate limits, network jank, or something upstream. Your logs just show an open connection with no data coming through — good luck figuring out what actually happened.
Minimal repro
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_OPENAI_KEY",
    base_url="https://aibadgr.com/v1",
)

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Explain AI"}],
    stream=True,
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```

This request routes through AI Badgr and returns a stable request ID that links to an execution record.
Note: AI Badgr is OpenAI-compatible and works as a drop-in proxy. No SDK changes required — only the base_url changes.
What a per-request execution record makes visible
- Stream initialization time
- Number of chunks received before hang
- Last successful chunk timestamp
- Network timeouts or provider errors
- Token usage up to failure point
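You can also capture some of this on the client side while you wait for a receipt. The sketch below wraps any chunk iterator with a per-chunk timeout, so a stalled stream raises instead of hanging forever and reports how many chunks arrived before the stall. The function name `iter_with_chunk_timeout` and the 30-second default are our choices, not part of any SDK:

```python
import queue
import threading
import time

def iter_with_chunk_timeout(stream, timeout=30.0):
    """Yield items from `stream`, raising TimeoutError when the gap
    between consecutive chunks exceeds `timeout` seconds."""
    q = queue.Queue()
    _DONE = object()

    def pump():
        # Drain the stream on a background thread so the consumer
        # can enforce a timeout on each queue read.
        try:
            for item in stream:
                q.put(item)
            q.put(_DONE)
        except Exception as exc:  # surface producer errors to the consumer
            q.put(exc)

    threading.Thread(target=pump, daemon=True).start()

    received = 0
    while True:
        try:
            item = q.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError(f"stream stalled after {received} chunks")
        if item is _DONE:
            return
        if isinstance(item, Exception):
            raise item
        received += 1
        yield item
```

Wrap the stream from the repro above (`for chunk in iter_with_chunk_timeout(stream): ...`) and a mid-response hang becomes a `TimeoutError` with a chunk count, instead of a silent open connection.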
Run 1 request → get receipt
Change your base URL to https://aibadgr.com/v1 and run your request.
The response includes an X-Badgr-Request-Id header that links to a receipt showing latency, retries, tokens, cost, and failure stage for that specific execution.
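To log that header next to your own request logs, you need the raw HTTP response. A minimal sketch, assuming the openai Python SDK v1.x (whose `.with_raw_response` accessor exposes response headers); the helper name `badgr_request_id` is ours, not part of any SDK:

```python
def badgr_request_id(headers):
    """Case-insensitive lookup of the X-Badgr-Request-Id header."""
    for name, value in headers.items():
        if name.lower() == "x-badgr-request-id":
            return value
    return None

# With the openai Python SDK (v1.x), raw headers are exposed via
# `.with_raw_response`:
#
#   raw = client.chat.completions.with_raw_response.create(
#       model="gpt-4o-mini",
#       messages=[{"role": "user", "content": "Explain AI"}],
#       stream=True,
#   )
#   print(badgr_request_id(raw.headers))  # store this with your own logs
#   stream = raw.parse()                  # recover the iterable stream
```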
Not the engineer?
Share this page with your dev and ask them to run one request through AI Badgr. That's all that's needed to get the receipt.
Failures like this are only diagnosable when you can see what happened to a single request from start to finish, instead of trying to piece it together from scattered logs.