One model, many providers
Popular models are served by several providers at once. FastRouter routes each request to the provider that best fits your goal-no separate integrations per provider.
For models offered by multiple providers, FastRouter picks which provider serves each request-by lowest price, lowest latency, or highest throughput-with explicit ordering, filtering, and automatic failover built in.
No credit card required · Free to start
openai/gpt-4o
Offered by 3 providers
Azure
azure
$2.50 /1M
540 ms
96 tok/s
Together
together
$3.20 /1M
620 ms
124 tok/s
OpenAI
openai
$5.00 /1M
410 ms
88 tok/s
Stop hardcoding a single provider. Set a strategy once and FastRouter selects the best provider for every request-on cost, speed, or scale.
Popular models are served by several providers at once. FastRouter routes each request to the provider that best fits your goal-no separate integrations per provider.
Sort providers automatically by price, latency, or throughput-or rely on the balanced default that favors low cost and high uptime.
Set an exact provider order, restrict with only, exclude with ignore, and toggle fallbacks-all from the provider object on a single call.
Send a chat completion for a multi-provider model. FastRouter filters the candidates, ranks them by your order or sort strategy, and serves the request from the top provider-falling back automatically if it is down.
Request
FastRouter
Provider
Append a model suffix and FastRouter applies the matching strategy automatically-no provider object required. Send nothing at all and the balanced default strategy takes over.
Set provider.sort to rank every provider that serves your model. Leave it unset and FastRouter applies its default strategy, which favors providers that meet a performance and uptime threshold and weights them by the inverse square of price.
sort: "price" routes to the least expensive provider for the selected model.
sort: "latency" routes to the fastest responding provider.
sort: "throughput" prefers the provider with the highest tokens-per-minute output.
Routing strategy
Default · balanced
Low cost & high uptime
Lowest price
sort: "price"
Lowest latency
sort: "latency"
Highest throughput
sort: "throughput"
Use order to define the precise sequence of providers to attempt. Providers are tried in that order, and allow_fallbacks decides whether the request can move on to the next one.
order takes an array of provider slugs and attempts them in exactly that sequence.
Set allow_fallbacks: true to use backup providers when the primary is unavailable.
Each provider is tried in turn, so a single request flows down the list until one succeeds.
Provider order
Narrow the candidate pool before any ordering or sorting runs. only and ignore decide which providers are eligible, while require_parameters keeps requests on providers that support every parameter you send.
Restrict the request to a specific set of providers; order and sort apply only to that subset.
Exclude listed providers from the request regardless of ordering or sorting.
Set to true so only providers that support all parameters in your request are considered.
Every strategy is set on the provider object and ranks the providers that serve your model. Choose the one that matches what each workload needs to optimize.
| Strategy | DefaultNo config needed | Lowest pricesort: price | Lowest latencysort: latency | Highest throughputsort: throughput |
|---|---|---|---|---|
| Optimizes for | ||||
| Lowest cost per token | Included | Included | Not included | Not included |
| Fastest response time | Not included | Not included | Included | Not included |
| Highest tokens per minute | Not included | Not included | Not included | Included |
| Performance & uptime threshold | Included | Not included | Not included | Not included |
| Configuration | ||||
| Model suffix shortcut | Not included | :price | Not included | :throughput |
| Combine with order / only / ignore | Included | Included | Included | Included |
| Best for | ||||
| Cost-sensitive workloads | Included | Included | Not included | Not included |
| Latency-critical, interactive apps | Not included | Not included | Included | Not included |
| High-volume or batch generation | Not included | Not included | Not included | Included |
Strategies are set on the provider object (or via the :price and :throughput model suffixes) and combine with order, only, ignore, and allow_fallbacks.
From cost control to latency-sensitive UX and high-volume jobs, provider routing lets each request optimize for what matters most.
Always land on the cheapest provider for a given model with sort: "price" or the :price suffix-no code changes per provider.
Route latency-sensitive chat and agent calls to the fastest provider with sort: "latency" for snappy responses.
Prefer the highest-throughput provider with sort: "throughput" (or :throughput) so bulk generation finishes sooner.
Keep sensitive traffic on approved providers with only and ignore, and add allow_fallbacks so requests stay resilient.
Many models are offered by more than one provider. Provider routing decides which provider serves each request, using the provider object in your chat completions call-order, only, ignore, sort, allow_fallbacks, and require_parameters-so you control cost, speed, and reliability without changing application code.
There are four. The default strategy favors providers that meet a performance and uptime threshold and weights them by the inverse square of price for cost-effective, reliable routing. Lowest Price (sort: "price") chooses the least expensive provider, Lowest Latency (sort: "latency") chooses the fastest responding provider, and Highest Throughput (sort: "throughput") prefers the highest tokens-per-minute provider.
Add a provider object to your request and set sort to "price", "latency", or "throughput". You can also use model suffixes-:price and :throughput-as shortcuts that trigger sorting without setting provider.sort, for example openai/gpt-4o:price.
Use provider.order with an array of provider slugs, such as ["azure", "openai"]. Providers are attempted in exactly that sequence. Set allow_fallbacks: true so the request can fall through to the next provider in the list when the primary is unavailable.
Yes. only restricts the request to a specific set of providers, and any order or sort logic applies just to that subset. ignore excludes listed providers regardless of ordering or sorting. You can also set require_parameters: true to limit routing to providers that support every parameter in your request.
If you send no provider object, FastRouter applies its default strategy: it gives higher priority to models and providers that meet a performance and uptime threshold and weights them by the inverse square of their price. This selects cost-effective options without compromising reliability-no configuration required.
Add a provider object to your chat completions call and FastRouter handles the rest-sorting by price, latency, or throughput with ordering, filtering, and fallbacks built in.