Provider Routing

Route every model to its best provider

For models offered by multiple providers, FastRouter picks which provider serves each request-by lowest price, lowest latency, or highest throughput-with explicit ordering, filtering, and automatic failover built in.

No credit card required · Free to start

Provider routing

openai/gpt-4o

Offered by 3 providers

sort: price
AZ

Azure

azure

Selected
Price

$2.50 /1M

Latency

540 ms

Output

96 tok/s

TG

Together

together

Rank #2
Price

$3.20 /1M

Latency

620 ms

Output

124 tok/s

OA

OpenAI

openai

Rank #3
Price

$5.00 /1M

Latency

410 ms

Output

88 tok/s

Why provider routing

The same model, routed to the right provider

Stop hardcoding a single provider. Set a strategy once and FastRouter selects the best provider for every request-on cost, speed, or scale.

One model, many providers

Popular models are served by several providers at once. FastRouter routes each request to the provider that best fits your goal-no separate integrations per provider.

Strategies for cost, speed, or scale

Sort providers automatically by price, latency, or throughput-or rely on the balanced default that favors low cost and high uptime.

Precise control per request

Set an exact provider order, restrict with only, exclude with ignore, and toggle fallbacks-all from the provider object on a single call.

How it works

From one request to the best provider

Send a chat completion for a multi-provider model. FastRouter filters the candidates, ranks them by your order or sort strategy, and serves the request from the top provider-falling back automatically if it is down.

Request

Model + provider object

openai/gpt-4oprovider
  • Send a chat completion for a model offered by multiple providers.
  • Attach a provider object or a :price / :throughput model suffix.

FastRouter

Provider routing

only / ignoreordersort
  • Filters candidates with only and ignore, then ranks by order or sort.
  • Honors require_parameters so unsupported providers are skipped.

Provider

Best-matched provider

Lowest priceLowest latencyHighest throughput
  • Serves the request from the top-ranked provider for your strategy.
  • allow_fallbacks retries the next provider when the primary is down.
"model": "openai/gpt-4o:price"
"model": "openai/gpt-4o:throughput"

Sort without a provider object

Append a model suffix and FastRouter applies the matching strategy automatically-no provider object required. Send nothing at all and the balanced default strategy takes over.

  • :price and :throughput suffixes shortcut sort: "price" and sort: "throughput".
  • No strategy set? The default favors low cost and high uptime, weighted by the inverse square of price.
Routing strategies

Optimize for price, latency, or throughput

Set provider.sort to rank every provider that serves your model. Leave it unset and FastRouter applies its default strategy, which favors providers that meet a performance and uptime threshold and weights them by the inverse square of price.

Lowest price

sort: "price" routes to the least expensive provider for the selected model.

Lowest latency

sort: "latency" routes to the fastest responding provider.

Highest throughput

sort: "throughput" prefers the provider with the highest tokens-per-minute output.

Routing strategy

provider.sort

Default · balanced

Low cost & high uptime

Lowest price

sort: "price"

Active

Lowest latency

sort: "latency"

Highest throughput

sort: "throughput"

"provider": { "sort": "price" }
Ordering & fallbacks

Set an exact provider order with safe fallbacks

Use order to define the precise sequence of providers to attempt. Providers are tried in that order, and allow_fallbacks decides whether the request can move on to the next one.

Explicit order

order takes an array of provider slugs and attempts them in exactly that sequence.

Automatic fallbacks

Set allow_fallbacks: true to use backup providers when the primary is unavailable.

Sequential attempts

Each provider is tried in turn, so a single request flows down the list until one succeeds.

Provider order

order
  1. 1azure
    Primary
  2. 2openai
    Fallback
  3. 3together
    Fallback
allow_fallbackstrue
"order": ["azure", "openai"]
Filtering & constraints

Restrict, exclude, and enforce parameter support

Narrow the candidate pool before any ordering or sorting runs. only and ignore decide which providers are eligible, while require_parameters keeps requests on providers that support every parameter you send.

only

Restrict the request to a specific set of providers; order and sort apply only to that subset.

ignore

Exclude listed providers from the request regardless of ordering or sorting.

require_parameters

Set to true so only providers that support all parameters in your request are considered.

POST /api/v1/chat/completions
JSON
{
"model": "openai/gpt-4o",
"provider": {
"only": ["azure", "openai"],
"ignore": ["openai"],
"sort": "price",
"require_parameters": true
}
}
Routed → azureLowest-price match
Strategy comparison

Pick the routing strategy per request

Every strategy is set on the provider object and ranks the providers that serve your model. Choose the one that matches what each workload needs to optimize.

Comparison of FastRouter provider routing strategies by what they optimize and when to use them
StrategyDefaultNo config neededLowest pricesort: priceLowest latencysort: latencyHighest throughputsort: throughput
Optimizes for
Lowest cost per tokenIncludedIncludedNot includedNot included
Fastest response timeNot includedNot includedIncludedNot included
Highest tokens per minuteNot includedNot includedNot includedIncluded
Performance & uptime thresholdIncludedNot includedNot includedNot included
Configuration
Model suffix shortcutNot included:priceNot included:throughput
Combine with order / only / ignoreIncludedIncludedIncludedIncluded
Best for
Cost-sensitive workloadsIncludedIncludedNot includedNot included
Latency-critical, interactive appsNot includedNot includedIncludedNot included
High-volume or batch generationNot includedNot includedNot includedIncluded

Strategies are set on the provider object (or via the :price and :throughput model suffixes) and combine with order, only, ignore, and allow_fallbacks.

Use cases

Routing that adapts to every workload

From cost control to latency-sensitive UX and high-volume jobs, provider routing lets each request optimize for what matters most.

Cut spend on commodity models

Always land on the cheapest provider for a given model with sort: "price" or the :price suffix-no code changes per provider.

Keep interactive apps fast

Route latency-sensitive chat and agent calls to the fastest provider with sort: "latency" for snappy responses.

Push high-volume batch jobs

Prefer the highest-throughput provider with sort: "throughput" (or :throughput) so bulk generation finishes sooner.

Pin or exclude specific providers

Keep sensitive traffic on approved providers with only and ignore, and add allow_fallbacks so requests stay resilient.

FAQ

Provider routing questions, answered

Many models are offered by more than one provider. Provider routing decides which provider serves each request, using the provider object in your chat completions call-order, only, ignore, sort, allow_fallbacks, and require_parameters-so you control cost, speed, and reliability without changing application code.

There are four. The default strategy favors providers that meet a performance and uptime threshold and weights them by the inverse square of price for cost-effective, reliable routing. Lowest Price (sort: "price") chooses the least expensive provider, Lowest Latency (sort: "latency") chooses the fastest responding provider, and Highest Throughput (sort: "throughput") prefers the highest tokens-per-minute provider.

Add a provider object to your request and set sort to "price", "latency", or "throughput". You can also use model suffixes-:price and :throughput-as shortcuts that trigger sorting without setting provider.sort, for example openai/gpt-4o:price.

Use provider.order with an array of provider slugs, such as ["azure", "openai"]. Providers are attempted in exactly that sequence. Set allow_fallbacks: true so the request can fall through to the next provider in the list when the primary is unavailable.

Yes. only restricts the request to a specific set of providers, and any order or sort logic applies just to that subset. ignore excludes listed providers regardless of ordering or sorting. You can also set require_parameters: true to limit routing to providers that support every parameter in your request.

If you send no provider object, FastRouter applies its default strategy: it gives higher priority to models and providers that meet a performance and uptime threshold and weights them by the inverse square of their price. This selects cost-effective options without compromising reliability-no configuration required.

Route every request to its best provider

Add a provider object to your chat completions call and FastRouter handles the rest-sorting by price, latency, or throughput with ordering, filtering, and fallbacks built in.