Async LLM job failed

What engineers usually see

  • Background LLM job fails without clear error
  • Job was queued and started but didn't complete
  • No visibility into what happened during execution
  • Cannot reproduce failure in development

Why this is hard to debug

Async jobs run outside the request context, so standard application logging rarely captures what happened during execution, and a failure can't be tied back to the specific LLM call that caused it. Receipts, however, work the same for sync and async requests: every call that routes through the proxy gets its own execution record.

Minimal repro

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_OPENAI_KEY",
    base_url="https://aibadgr.com/v1"
)

# Background worker function: called from the job queue, not a web request
def process_job(job_id):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "process..."}],
        # Custom header correlating this LLM call with the queued job
        extra_headers={"X-Job-ID": job_id}
    )
    return response

This request routes through AI Badgr and returns a stable request ID that links to an execution record.

Note: AI Badgr is OpenAI-compatible and works as a drop-in proxy. No SDK changes required — only the base_url changes.
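To capture the receipt ID programmatically, the worker can read it out of the response headers. A minimal sketch: `receipt_id` is a hypothetical helper written for illustration, and the `with_raw_response` usage shown in the comment is the standard OpenAI Python SDK mechanism for accessing raw response headers.

```python
def receipt_id(headers):
    """Find the AI Badgr receipt ID in response headers, case-insensitively.

    Hypothetical helper: works on any mapping of header names to values.
    """
    for key, value in headers.items():
        if key.lower() == "x-badgr-request-id":
            return value
    return None

# With the OpenAI Python SDK, raw headers are exposed via with_raw_response:
#   raw = client.chat.completions.with_raw_response.create(...)
#   rid = receipt_id(raw.headers)
#   response = raw.parse()

print(receipt_id({"Content-Type": "application/json",
                  "X-Badgr-Request-Id": "req_123"}))  # → req_123
```

Storing that ID in the job record means any failed job can be looked up by its receipt later, without rerunning it.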

What a per-request execution record makes visible

  • Async job request details
  • Execution timing
  • Failure reason if job failed
  • Cost per async job
  • Job correlation ID

Run 1 request → get receipt

Change your base URL to https://aibadgr.com/v1 and run your request.

The response includes an X-Badgr-Request-Id header that links to a receipt showing latency, retries, tokens, cost, and failure stage for that specific execution.

Not the engineer?
Share this page with your dev and ask them to run one request through AI Badgr. That's all that's needed to get the receipt.

Debugging an async failure only becomes tractable when you can see what happened to a single request from start to finish, instead of trying to piece it together from scattered worker logs.