Replicate Outage Tracker & Status

Real-time Replicate API status. Live incident detection, developer mitigations, and a one-click failover option.

About Replicate API Status

Replicate provides on-demand inference for thousands of open-source AI models including Stable Diffusion, FLUX, Llama, Whisper, and custom fine-tuned models via a simple prediction API. This page tracks Replicate API outages, degradations, and incidents in real time, automatically updated every 60 seconds from our monitoring infrastructure.
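
As a sketch of what that prediction API looks like, the helper below assembles the pieces of a create-prediction request against Replicate's public REST endpoint. The token, version hash, and input values are placeholders, and the helper name `build_prediction_request` is illustrative, not part of any SDK:

```python
import json

REPLICATE_API = "https://api.replicate.com/v1/predictions"

def build_prediction_request(api_token: str, version: str, model_input: dict) -> dict:
    """Assemble the HTTP pieces for creating a Replicate prediction.

    Returns a dict you can hand to any HTTP client, e.g.
    requests.post(req["url"], headers=req["headers"], data=req["body"]).
    """
    return {
        "url": REPLICATE_API,
        "headers": {
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"version": version, "input": model_input}),
    }
```

Keeping request construction separate from the HTTP client makes it easy to add the retry and failover logic described below without touching the payload code.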

Official status page: https://status.replicate.com

Common Replicate Outage Symptoms

  • HTTP 429 — rate limit exceeded or prediction queue at capacity
  • HTTP 503 — API temporarily unavailable or predictions timing out
  • Long cold-start latency when a model container needs to be provisioned
  • Prediction stuck in 'starting' or 'processing' state during capacity pressure
  • Webhook delivery failures during backend incidents
  • Specific model versions becoming unavailable during platform updates

What to Do During a Replicate Outage

  1. Honor the Retry-After header on 429 responses and back off before retrying prediction creation.
  2. Use the predictions.get polling endpoint with exponential backoff instead of relying solely on webhooks.
  3. Switch to a bring-your-own-key (BYOK) proxy such as AI Badgr to get per-request receipts and automatic retry handling.
  4. Monitor the official Replicate status page at status.replicate.com for incident announcements.
  5. Set a prediction timeout and cancel stuck predictions via the API rather than letting them queue indefinitely.

Other AI Provider Status Pages

Replicate Outage FAQ

Is Replicate down right now?

This page checks our live monitoring infrastructure, updated every 60 seconds, which tracks the official Replicate status page and our own request telemetry. The status badge at the top reflects the current state.

Why is my Replicate prediction stuck in 'starting' state?

A prediction stuck in 'starting' usually means the model container is cold and needs to be provisioned (a cold start), or there is a capacity constraint. Cold starts typically take 30–120 seconds and can last longer during incidents. Poll the prediction endpoint and set a timeout.
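
One way to implement that advice is a deadline-based poll that cancels rather than queueing forever. Here `get_status` and `cancel` are injected callables you would wrap around the predictions get/cancel API calls; the helper name and defaults are assumptions:

```python
import time

def wait_or_cancel(get_status, cancel, timeout: float, interval: float = 2.0,
                   clock=time.monotonic, sleep=time.sleep) -> str:
    """Poll a prediction until it leaves the cold-start/queue states or a
    deadline passes; on timeout, cancel it instead of letting it queue.

    get_status() returns the current status string; cancel() issues the
    cancel call. clock/sleep are injectable for testing.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        status = get_status()
        if status not in ("starting", "processing"):
            return status
        sleep(interval)
    cancel()
    return "canceled"
```

Using a monotonic clock for the deadline avoids surprises from wall-clock adjustments during long waits.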

Why is Replicate returning 429 errors?

HTTP 429 from the Replicate API means you have hit a rate limit or the prediction queue is at capacity. Back off and retry with exponential delay. During peak load, reduce concurrent prediction requests.
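
A minimal retry loop for this pattern, with `make_request` standing in for your prediction-creation call (the function returns a `(status, body)` pair; this is an illustrative sketch, not an SDK function):

```python
import time

def call_with_backoff(make_request, max_attempts: int = 5, base: float = 1.0,
                      sleep=time.sleep):
    """Call make_request() -> (status, body); on HTTP 429, wait
    base * 2**attempt seconds and retry, up to max_attempts tries."""
    for attempt in range(max_attempts):
        status, body = make_request()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base * (2 ** attempt))
    return status, body
```

In production you would also add jitter and honor Retry-After, as described in the mitigation steps above.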

Can I automatically failover away from Replicate during an outage?

Yes. For image generation models, AI Badgr can failover to alternative image generation backends. For language models, it can route to other providers. Change one configuration line and get transparent receipts.

How do I handle Replicate webhook delivery failures?

During Replicate incidents, webhooks may be delayed or dropped. Always implement polling as a fallback: after creating a prediction, poll predictions.get on a schedule until the prediction reaches a terminal state (succeeded/failed/canceled).
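
A polling fallback along those lines, with `get_prediction` standing in for a predictions.get call that returns the prediction as a dict (the helper name and defaults are assumptions):

```python
import time

TERMINAL_STATES = {"succeeded", "failed", "canceled"}

def poll_until_terminal(get_prediction, max_polls: int = 60, interval: float = 2.0,
                        sleep=time.sleep) -> dict:
    """Webhook fallback: poll until the prediction reaches a terminal
    state, or give up after max_polls and return the last snapshot."""
    for _ in range(max_polls):
        prediction = get_prediction()
        if prediction["status"] in TERMINAL_STATES:
            return prediction
        sleep(interval)
    return prediction
```

Run this alongside your webhook handler and treat whichever fires first as authoritative; deduplicate by prediction ID so a late-arriving webhook is a no-op.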

Never get stuck in a Replicate outage again

AI Badgr acts as a transparent proxy for your existing API keys. One line of code change. Zero vendor lock-in. Instant failover when Replicate is down.

Get Started Free →