Batch LLM cost spike

What engineers usually see

  • Batch job costs much more than expected
  • Cannot identify which requests in batch were expensive
  • Unclear if cost spike is normal or anomaly
  • Need to attribute cost to specific batch items

Why this is hard to debug

Batch processing aggregates many requests into a single bill, so you can't see per-item costs or spot outliers. Receipts restore per-request cost visibility even in batch scenarios.

Minimal repro

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_OPENAI_KEY",
    base_url="https://aibadgr.com/v1"
)

items = ["..."]  # your batch inputs

# Batch process many items
for item in items:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": item}]
    )
    # Each request gets its own receipt, keyed by its request ID

This request routes through AI Badgr and returns a stable request ID that links to an execution record.

Note: AI Badgr is OpenAI-compatible and works as a drop-in proxy. No SDK changes required — only the base_url changes.

What a per-request execution record makes visible

  • Cost per batch item
  • Total batch cost
  • Cost outliers in batch
  • Batch cost distribution
  • Per-item cost trends
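As a sketch of what per-item attribution enables: once each item's cost has been collected from its receipt into a list (the dollar figures below are made up for illustration), flagging outliers is a few lines of standard statistics.

```python
import statistics

# Hypothetical per-item costs (USD) collected from receipts
costs = [0.002, 0.003, 0.002, 0.041, 0.003, 0.002]

mean = statistics.mean(costs)
stdev = statistics.stdev(costs)

# Flag items more than 2 standard deviations above the mean
outliers = [
    (i, cost) for i, cost in enumerate(costs)
    if cost > mean + 2 * stdev
]
print(f"Total batch cost: ${sum(costs):.3f}")
print(f"Outliers (index, cost): {outliers}")
```

Here item 3 ($0.041) stands out against the ~$0.002-0.003 baseline; without per-request receipts, only the $0.053 total would be visible.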

Run 1 request → get receipt

Change your base URL to https://aibadgr.com/v1 and run your request.

The response includes an X-Badgr-Request-Id header that links to a receipt showing latency, retries, tokens, cost, and failure stage for that specific execution.
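To capture that header programmatically, one option is the OpenAI Python SDK's raw-response interface, which exposes HTTP headers alongside the parsed body. A minimal sketch — the header name comes from this page; `extract_receipt_id` is an illustrative helper, not part of any SDK:

```python
from typing import Mapping, Optional

def extract_receipt_id(headers: Mapping[str, str]) -> Optional[str]:
    """Pull the AI Badgr receipt ID from HTTP response headers, case-insensitively."""
    lowered = {k.lower(): v for k, v in headers.items()}
    return lowered.get("x-badgr-request-id")

# With the OpenAI SDK, raw headers are available via with_raw_response:
#   raw = client.chat.completions.with_raw_response.create(...)
#   receipt_id = extract_receipt_id(raw.headers)
#   response = raw.parse()  # the usual ChatCompletion object

print(extract_receipt_id({"X-Badgr-Request-Id": "req_abc123"}))  # → req_abc123
```

Logging the returned ID next to each batch item is what lets you join receipts back to the inputs that produced them.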

Not the engineer?
Share this page with your dev and ask them to run one request through AI Badgr. That's all that's needed to get the receipt.

This kind of debugging only works when you can see what happened to a single request from start to finish, instead of trying to piece it together from scattered logs.