One request, an ordered list
Set your primary in the model field and one or more backups in the models array. FastRouter tries them top to bottom until one succeeds.
List a primary model and ordered fallbacks in a single request. If a model is rate-limited, down, or returns an error, FastRouter automatically retries the next candidate-so a single failure never drops the request.
No credit card required · Free to start
openai/gpt-4o
Rate limited
openai/o1
Provider unavailable
google/gemini-1.5-pro
Returned a completion
Turn a single point of failure into an ordered list of backups. No client retry loops, no custom failover code-just a models array on the request you already send.
Set your primary in the model field and one or more backups in the models array. FastRouter tries them top to bottom until one succeeds.
When a model is rate-limited, unavailable, or returns an error, FastRouter moves to the next candidate for you-no client-side retry logic to maintain.
You are billed only for the model that actually answers, and the response reports which one ran-so failover never hides your real usage.
FastRouter sends your request to the primary model first. On failure it walks the fallback list in order until a model returns a successful response-or every candidate has been tried.
Attempt 1
openai/gpt-4o
Attempt 2
openai/o1
Attempt 3
google/gemini-1.5-pro
200 OK · request succeeded
"model": "google/gemini-1.5-pro"
Add a models array next to your existing model field and the same call now survives rate limits, downtime, and moderation. The final model is returned in the response, and you are billed only for the model that actually answered.
Send your usual chat completions request, then add a models array of backups. The primary in model is tried first, and the array is your ordered safety net.
The model field sets the primary; the models array lists one or more fallbacks, tried in the order you write them.
FastRouter iterates the list from top to bottom until a model returns a successful response or every candidate fails.
List backups across different models and providers so a single point of failure becomes a chain of reliable alternatives.
Fallback list
Routing order
Tried top to bottomFallback kicks in whenever the current model is unavailable or returns an error-so transient problems become a retry instead of a dropped request.
Hit a rate limit at peak traffic and the request rolls over to the next model instead of failing.
If a model or provider is down or unreachable, FastRouter keeps the request moving through your list.
Moderation blocks and other errors trigger the next candidate rather than returning an error to your app.
Failover triggers
openai/gpt-4o
Primary attempt failed
Falls back when a model
Failover never leaves you guessing. The response tells you exactly which model ran, and billing follows whichever model actually processed the request.
The model field of the response body reports the model that ultimately produced the completion.
Billing is based on the model that actually processes the request-failed candidates are never charged.
If the primary and every fallback fail, FastRouter returns the final error-so you know the whole list was exhausted.
openai/gpt-4o · openai/o1 not charged
A single model has nowhere to go when it fails. A fallback list keeps the very same request alive across rate limits, downtime, and moderation.
| Behavior | Single modelmodel only | Fallback listmodel + models |
|---|---|---|
| When a model fails | ||
| Retries the next candidate | Not included | Included |
| Survives rate limits | Not included | Included |
| Survives downtime & unavailability | Not included | Included |
| Recovers from moderation blocks | Not included | Included |
| Results & cost | ||
| Reports the model that answered | Included | Included |
| Billed only for the model that ran | Included | Included |
| Tries every candidate before erroring | Not included | Included |
Add resilience by passing a models array alongside model-no other change to your request.
Wherever an outage or a limit would normally drop a request, an ordered fallback list keeps it moving.
Roll over to a backup when your primary model is rate-limited or down, so a single failure never reaches your users.
List candidates across different providers so one provider's incident cannot take your whole app offline.
When burst traffic trips a model's rate limit, requests continue on the next candidate automatically.
Put your first-choice model up front and alternates behind it-you are billed only for the one that answers.
Pass your primary model in the model field and one or more fallbacks in the models array on the same chat completions request. FastRouter tries the primary first, then each model in the array in order until one returns a successful response.
FastRouter falls back when the current model is unavailable or returns an error-for example due to rate limits, downtime, or moderation. Instead of failing the request, it moves on to the next candidate in your list.
Strictly in the order you list them. FastRouter attempts the model value first, then walks the models array from top to bottom until a model succeeds or every candidate has failed.
Billing is based on the model that actually processes the request. If your primary fails and a fallback answers, you are charged for the fallback that ran-not the candidates that failed.
The final model used is returned in the model field of the response body, so you can always see which candidate produced the response.
If the primary and all fallbacks fail, FastRouter returns the final error to you. Listing additional candidates-ideally across different providers-reduces the chance of reaching that point.
Pass a models array on your existing chat completions request and let FastRouter retry the next candidate whenever one fails.