LLM retries in background jobs

What engineers usually see

  • Background jobs retry LLM requests
  • No visibility into retry behavior
  • Cannot tell if retries eventually succeeded
  • Costs accumulate from background retries

Why this is hard to debug

Background retry logic is opaque: the attempts happen off the request path, so you can't see them without instrumenting the job queue yourself. A per-request receipt captures every retry attempt regardless of where the job executed.
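
For contrast, here is a minimal sketch of what that instrumentation looks like when you build it yourself. The call_with_logged_retries wrapper below is hypothetical, not part of any SDK, and you would need one like it around every job:

import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("background-jobs")

# Hypothetical wrapper: hand-rolled retry logging you would
# otherwise have to bolt onto every background job by hand.
def call_with_logged_retries(fn, max_retries=3, base_delay=1.0):
    for attempt in range(1, max_retries + 2):
        try:
            result = fn()
            log.info("attempt %d succeeded", attempt)
            return result
        except Exception as exc:
            log.warning("attempt %d failed: %s", attempt, exc)
            if attempt > max_retries:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff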

Minimal repro

from openai import OpenAI

# Point the SDK at AI Badgr. max_retries=3 enables the SDK's
# built-in retry-with-backoff on transient errors (429s, 5xx, timeouts).
client = OpenAI(
    api_key="YOUR_OPENAI_KEY",
    base_url="https://aibadgr.com/v1",
    max_retries=3
)

# Background job with retries: each retry happens silently inside
# the SDK call, invisible to the job queue that runs this function.
def background_job():
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "background"}]
    )
    return response
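
To exercise the job the way a queue worker would, off the request path, here is a minimal stdlib driver (the thread pool stands in for whatever job runner you actually use):

from concurrent.futures import ThreadPoolExecutor

# Stand-in for a real job queue: the SDK's retries all happen
# inside the worker thread, and nothing about them reaches the caller.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(background_job)
    print(future.result().choices[0].message.content)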

This request routes through AI Badgr and returns a stable request ID that links to an execution record.

Note: AI Badgr is OpenAI-compatible and works as a drop-in proxy. No SDK changes required — only the base_url changes.
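
Because only the base URL changes, you can often avoid touching client code entirely: the OpenAI SDK reads OPENAI_BASE_URL from the environment when no base_url argument is passed. A sketch:

import os
from openai import OpenAI

# Set before the client is constructed; the SDK picks these up itself.
os.environ["OPENAI_BASE_URL"] = "https://aibadgr.com/v1"
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"

client = OpenAI()  # no explicit base_url or api_key needed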

What a per-request execution record makes visible

  • Background retry attempts
  • Retry timing and backoff
  • Cost per retry
  • Final retry outcome
  • Whether background retries eventually succeed or just accumulate cost
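
As a rough illustration only (these field names are hypothetical, not AI Badgr's actual schema), a receipt for the job above might carry something like:

# Hypothetical receipt shape, for illustration; not the real schema.
receipt = {
    "request_id": "req_abc123",  # from the X-Badgr-Request-Id header
    "attempts": [
        {"n": 1, "status": 429, "latency_ms": 820, "cost_usd": 0.0},
        {"n": 2, "status": 200, "latency_ms": 640, "cost_usd": 0.0002},
    ],
    "outcome": "success",  # final retry outcome
}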

Run 1 request → get receipt

Change your base URL to https://aibadgr.com/v1 and run your request.

The response includes an X-Badgr-Request-Id header that links to a receipt showing latency, retries, tokens, cost, and failure stage for that specific execution.
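
If you want that ID in code rather than in logs, the OpenAI SDK exposes raw response headers. A sketch, assuming the header is returned as described above:

# Read the receipt ID off the raw HTTP response.
raw = client.chat.completions.with_raw_response.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "background"}]
)
request_id = raw.headers.get("X-Badgr-Request-Id")  # links to the receipt
response = raw.parse()  # the usual ChatCompletion object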

Not the engineer?
Share this page with your dev and ask them to run one request through AI Badgr. That's all that's needed to get the receipt.

Retry behavior like this only makes sense when you can see what happened to a single request from start to finish, instead of trying to piece it together from scattered logs.