Retry storm (OpenAI)

What engineers usually see

  • Failures trigger the SDK's automatic exponential-backoff retries
  • Multiple in-flight requests fail and retry at the same time
  • Retries cascade into further rate-limit errors
  • Costs spike with no successful completions

Why this is hard to debug

Retry storms are emergent behavior: no single per-request log shows the aggregate retry pattern or its cost impact. Receipts aggregate retry behavior across requests, which makes the storm visible.

Minimal repro

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_OPENAI_KEY',
  baseURL: 'https://aibadgr.com/v1',
  maxRetries: 5
});

// This pattern can cause retry storms: three concurrent requests with
// maxRetries: 5 means a shared failure can trigger up to 15 retries at once
await Promise.all([
  client.chat.completions.create({model: 'gpt-4o-mini', messages: [{role: 'user', content: '1'}]}),
  client.chat.completions.create({model: 'gpt-4o-mini', messages: [{role: 'user', content: '2'}]}),
  client.chat.completions.create({model: 'gpt-4o-mini', messages: [{role: 'user', content: '3'}]}),
]);

These requests route through AI Badgr, and each response carries a stable request ID that links to an execution record.

Note: AI Badgr is OpenAI-compatible and works as a drop-in proxy. No SDK changes required — only the base URL changes.
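One common mitigation for simultaneous retries is adding jitter to the backoff delay, so retries spread out instead of landing together. A minimal sketch of the widely used "full jitter" pattern (a random delay in [0, base × 2^attempt], capped); the function and parameter names here are our own, not SDK options:

```typescript
// Full-jitter exponential backoff: desynchronizes retries so concurrent
// requests don't all hit the API again at the same instant.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  // Exponential ceiling, capped to avoid unbounded waits.
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  // Random delay in [0, ceiling): two clients retrying after the same
  // failure will almost never pick the same delay.
  return Math.random() * ceiling;
}

// Demo: delays grow with the attempt number but stay within bounds.
for (let attempt = 0; attempt < 5; attempt++) {
  console.log(`attempt ${attempt}: ~${backoffDelayMs(attempt).toFixed(0)}ms`);
}
```

The OpenAI SDK applies its own backoff internally; a helper like this is useful when you implement retries yourself around higher-level operations.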

What a per-request execution record makes visible

  • Total retry attempts across requests
  • Concurrent request load over time
  • Rate limit cascade timeline
  • Cost of failed retry attempts
  • Recommended concurrency limits
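Acting on a concurrency recommendation usually means replacing the bare Promise.all with a bounded-concurrency runner. A minimal sketch in plain TypeScript; mapWithConcurrency is our own helper name, not part of the OpenAI SDK:

```typescript
// Run `fn` over `items` with at most `limit` promises in flight at once,
// preserving the input order in the results.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each worker claims the next unprocessed index until the queue drains.
  // JS is single-threaded, so `next++` here is not a race.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }

  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker),
  );
  return results;
}

// Against the repro above, this would replace the bare Promise.all, e.g.:
//   await mapWithConcurrency(['1', '2', '3'], 2, (content) =>
//     client.chat.completions.create({
//       model: 'gpt-4o-mini',
//       messages: [{ role: 'user', content }],
//     }),
//   );

// Self-contained demo with dummy async tasks:
mapWithConcurrency([1, 2, 3, 4], 2, async (n) => n * 10).then((r) =>
  console.log(r.join(',')), // → 10,20,30,40
);
```

Capping concurrency also caps the worst-case retry fan-out: a shared failure can only trigger `limit × maxRetries` retries instead of one per queued request.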

Run 1 request → get receipt

Change your base URL to https://aibadgr.com/v1 and run your request.

The response includes an X-Badgr-Request-Id header that links to a receipt showing latency, retries, tokens, cost, and failure stage for that specific execution.
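Capturing that header programmatically can look like the sketch below. It assumes Node 18+ (global Headers); the header name comes from this page, and the helper name is our own:

```typescript
// Extract the AI Badgr request ID from a response's headers.
// Header lookups via Headers.get() are case-insensitive per the Fetch spec.
function badgrRequestId(headers: Headers): string | null {
  return headers.get('X-Badgr-Request-Id');
}

// With the OpenAI Node SDK, the raw response (and its headers) is available
// by chaining `.withResponse()` on any API call:
//   const { data, response } = await client.chat.completions
//     .create({ model: 'gpt-4o-mini', messages: [{ role: 'user', content: 'hi' }] })
//     .withResponse();
//   const id = badgrRequestId(response.headers);

// Demo with a hand-built Headers object (the ID value is made up):
const headers = new Headers({ 'x-badgr-request-id': 'req_abc123' });
console.log(badgrRequestId(headers)); // → req_abc123
```

Logging this ID alongside your own correlation IDs lets you jump from an application log line straight to the matching receipt.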

Not the engineer?
Share this page with your dev and ask them to run one request through AI Badgr. That's all that's needed to get the receipt.

Emergent failures like retry storms only make sense when you can see what happened to a single request from start to finish, instead of piecing it together from scattered logs.