Back
Intelligent Routing: How FastRouter Cuts OpenClaw Costs by 80%

Intelligent Routing: How FastRouter Cuts OpenClaw Costs by 80%

How intelligent model routing works, what FastRouter’s auto mode actually does under the hood, and how to connect it in 10 minutes.

R
Ritesh Prasad
3 Min Read|Latest — April 16, 2026

Part 2 of 3: OpenClaw + FastRouter — Building AI Agents That Scale

April 2026 · FastRouter.ai Blog · 10 min read

If you haven’t read Part 1 yet, start there — it explains why AI agent economics broke overnight and what changed with the shift to always-on systems.

In this part 2 of the series, we focus on the solution: how intelligent model routing works, what FastRouter’s auto mode actually does under the hood, and how to connect the two in about 10 minutes.

Why agents are expensive (and where the waste is)

Before optimizing, it's worth understanding what drives OpenClaw's API costs in the first place.

Unlike a chatbot — where one human message produces one model response — an autonomous agent executes a loop. When OpenClaw wakes up (on heartbeat, on an incoming message, or on a cron schedule), a single "event" might trigger:

  1. A context pull: what's in memory, what tasks are pending, what channels have new input?
  2. A planning call: given the situation, what should I do next?
  3. One or more tool selection calls: which skill to invoke, with what parameters?
  4. Execution verification: did that work? Is the output correct?
  5. A response generation call: how should I communicate what happened?

That's 4–8 model calls per event, where a chatbot makes one. For a moderately active OpenClaw instance — 20–30 events per day between scheduled tasks and incoming messages — you're looking at 80–240 model calls daily. At Claude Sonnet 4.6 pricing ($3.00/$15.00 per million tokens), with average call sizes of 2,000 input tokens and 500 output tokens, that comes to roughly:

  • Input: 240 calls × 2,000 tokens = 480,000 tokens = $1.44/day
  • Output: 240 calls × 500 tokens = 120,000 tokens = $1.80/day
  • Total: ~$3.24/day, ~$97/month

That's for a single-agent, moderate-usage instance. Scale to a multi-agent setup with a stock monitoring agent, an email triage agent, and a research agent all running in parallel, and you're looking at $300–$600/month.

Here's what's important: the majority of those calls don't need Claude Sonnet. A heartbeat check asking "is there anything new I should act on?" is not a complex reasoning task. Summarizing an email thread for triage is not a task requiring frontier model capability. Formatting a research brief that a previous call already synthesized is not a job for the most expensive model available.

Roughly 70% of typical OpenClaw calls fall into categories where mid-tier or budget models perform at 90%+ parity with frontier models on output quality, at 5–60x lower cost. That's the optimization opportunity.

The model landscape: a practical guide

FastRouter gives you access to 134+ models. For OpenClaw workloads, the decision space narrows considerably once you match model capability to task type.

Tier 1: Heartbeats and routine checks

Recommended: GPT-5 nano, DeepSeek-V3.2

  • GPT-5 nano: $0.05/$0.40 per million tokens
  • DeepSeek-V3.2: $0.27/$0.40 per million tokens

These handle: scheduled wake-up checks, incoming message triage ("does this need a response?"), status polling, simple yes/no decisions. Fast, cheap, more than adequate. Routing your OpenClaw heartbeats here instead of Sonnet 4.6 saves 60x on those calls.

Tier 2: General assistant tasks

Recommended: Gemini 3 Flash, Mistral NeMo

  • Gemini 3 Flash: $0.50/$3.00 per million tokens
  • Mistral NeMo: $0.02/$0.04 per million tokens

These handle: email drafting, document summarization, research organization, calendar management, standard Q&A. Gemini 3 Flash is particularly strong for anything involving large documents — its 2M token context window is a practical advantage for processing long email threads or research reports. Mistral NeMo is extraordinarily cheap for routine drafting tasks.

Tier 3: Complex reasoning and code

Recommended: Claude Sonnet 4.6, GPT-5

  • Claude Sonnet 4.6: $3.00/$15.00 per million tokens
  • GPT-5: $1.25/$10.00 per million tokens

These handle: multi-step planning, complex research synthesis, code generation, nuanced judgment calls, anything where getting it wrong has meaningful consequences. This is where you spend money intentionally because the quality difference is real and worth it.

Tier 4: Maximum capability

Recommended: Claude Opus 4.6, GPT-5.4 Pro

  • Claude Opus 4.6: $5.00/$25.00 per million tokens
  • GPT-5.4 Pro: $30.00/$180.00 per million tokens

Use these sparingly, for the hardest tasks in your workflow — strategic planning, high-stakes document generation, complex debugging. At these prices, you want them handling a small fraction of total calls, not being the default.

How fastrouter/auto makes routing decisions

When you set your OpenClaw model to fastrouter/auto, FastRouter analyzes each incoming request before forwarding it to any model. The routing engine examines several signals:

Request complexity signals: Prompt length, vocabulary complexity, presence of multi-step instructions, code blocks, reasoning requirements.

Task type classification: Is this a simple retrieval or formatting task? A creative generation task? A complex reasoning task? A code task?

Your optimization mode: You set this at the account or key level:

  • Cost Optimized: Routes to the cheapest model that can handle the classified task adequately
  • Low Latency: Prioritizes response speed; routes to fastest models per task class
  • High Throughput: Optimizes for volume; balances cost and speed for sustained workloads

Historical performance: Over time, FastRouter incorporates provider availability and latency data into routing decisions, so a provider experiencing degraded performance gets traffic shifted away automatically.

The routing decision itself takes under 1ms. You pay for the model that handles the request; the routing layer is free. If the selected model is unavailable, FastRouter falls back to your specified fallbacks without requiring any code changes on your end.

The result: fastrouter/auto in Cost Optimized mode will route a heartbeat check to GPT-5 nano, a document summary to Gemini Flash, and a multi-step strategy query to Claude Sonnet — all without you configuring any of that explicitly.

Step-by-step: Connecting FastRouter to OpenClaw

There are two paths: the ClawHub skill (recommended — no config file editing, takes about two minutes) and manual configuration. Both work identically at runtime. We'll cover both.

Running OpenClaw on Hostinger

If you're starting fresh, the most commonly recommended setup is OpenClaw on a Hostinger VPS (~$15/month) — this is what the OpenClaw Mastery course uses as its default environment, and Hostinger's Docker Manager makes the deployment genuinely one-click.

Sign up at hostinger.com and provision a VPS. In the Hostinger control panel, navigate to Docker Manager and select One-Click Deploy. Choose the OpenClaw template — the entire installation and container setup runs automatically, no SSH or command-line work required.

Once the container is running, access your Control UI at http://your-vps-ip:18789, connect a channel (Telegram is recommended for mobile access), and you're ready for the FastRouter setup below. Total infrastructure cost: ~$15/month for the VPS, plus API usage.

The fastrouter-setup skill on ClawHub handles the entire configuration automatically — fetching the live model catalog, writing the provider config, and registering models — all triggered from a single chat command.

Step 1: Get your FastRouter API key

Sign up at fastrouter.ai. The free tier is available with up to $6 in credits when you add a credit card. Your key starts with sk-v1-.

Step 2: Install the skill

In your OpenClaw chat interface, type:

1openclaw skills install fastrouter-setup

OpenClaw confirms installation:

1Done.
2Installed: fastrouter-setup@1.2.0
3Location: /data/.openclaw/workspace/skills/fastrouter-setup
4If you want, I can read that skill next and use it to help set up the FastRouter gateway.

Step 3: Run the setup

1setup fastrouter as a provider, ask me for a key

OpenClaw reads the skill and prompts for your API key:

1I'm ready to configure it — I just need your FastRouter API key first.
2Please send: sk-v1-...

Paste your key. OpenClaw then handles everything:

  • Fetches the live FastRouter model catalog
  • Filters to active text and text+image-capable models
  • Adds all models under models.providers.fastrouter
  • Registers them in agents.defaults.models

Confirmation output:

1FastRouter is added to config.
2What happened:
3- fetched the live FastRouter catalog
4- filtered in active text / text+image-capable models
5- added 118 models under models.providers.fastrouter
6- registered them in agents.defaults.models

Step 4: Switch your model

In the OpenClaw Control UI, click the model picker at the top of the chat interface and select fastrouter/auto. Send a test message to confirm the switch. You'll see fastrouter/auto displayed in the model field — you're now routing through FastRouter.

No config file editing, no environment variables to set manually — the skill handles all of it.

Option B: Manual configuration

For users who prefer explicit control over the initial model list:

Set your API key:

1export FASTROUTER_API_KEY="sk-v1-your-key-here"

Add to ~/.openclaw/openclaw.json:

1{
2 "agents": {
3 "defaults": {
4 "model": {
5 "primary": "fastrouter/auto",
6 "fallbacks": [
7 "fastrouter/openai/gpt-5-nano",
8 "fastrouter/deepseek/deepseek-v3-2"
9 ]
10 }
11 }
12 },
13 "models": {
14 "mode": "merge",
15 "providers": {
16 "fastrouter": {
17 "baseUrl": "https://go.fastrouter.ai/api/v1",
18 "apiKey": "${FASTROUTER_API_KEY}",
19 "api": "openai-completions",
20 "models": [
21 { "id": "auto", "name": "FastRouter Auto" },
22 { "id": "openai/gpt-5-nano", "name": "GPT-5 Nano" },
23 { "id": "anthropic/claude-sonnet-4.6", "name": "Claude Sonnet 4.6" },
24 { "id": "anthropic/claude-opus-4.6", "name": "Claude Opus 4.6" },
25 { "id": "google/gemini-3-flash", "name": "Gemini 3 Flash" },
26 { "id": "deepseek/deepseek-v3-2", "name": "DeepSeek V3.2" }
27 ]
28 }
29 }
30 }
31}

Restart and verify:

1openclaw daemon restart
2openclaw models list

FastRouter models should appear alongside any existing providers.

You're now routing through FastRouter. Every subsequent model call goes through the FastRouter gateway before reaching any provider.

Advanced configuration: per-agent model assignment

The default configuration above routes all agents through fastrouter/auto. That's a good starting point. Once you understand your workload — which agents make what types of calls, how often — you can move to explicit per-agent routing.

This is the configuration pattern for a multi-agent production deployment:

1{
2 "agents": {
3 "defaults": {
4 "model": {
5 "primary": "fastrouter/auto"
6 }
7 },
8 "strategy": {
9 "model": {
10 "primary": "fastrouter/anthropic/claude-opus-4.6",
11 "fallbacks": ["fastrouter/anthropic/claude-sonnet-4.6"]
12 }
13 },
14 "dev": {
15 "model": {
16 "primary": "fastrouter/openai/gpt-5",
17 "fallbacks": ["fastrouter/anthropic/claude-sonnet-4.6"]
18 }
19 },
20 "research": {
21 "model": {
22 "primary": "fastrouter/google/gemini-3-flash",
23 "fallbacks": ["fastrouter/deepseek/deepseek-v3-2"]
24 }
25 },
26 "monitoring": {
27 "model": {
28 "primary": "fastrouter/openai/gpt-5-nano",
29 "fallbacks": ["fastrouter/deepseek/deepseek-v3-2"]
30 }
31 },
32 "email": {
33 "model": {
34 "primary": "fastrouter/google/gemini-3-flash",
35 "fallbacks": ["fastrouter/openai/gpt-5-nano"]
36 }
37 }
38 }
39}

The rationale per agent:

  • Strategy agent: This agent makes high-stakes planning decisions. Claude Opus's reasoning depth is worth the premium here — it's handling a small fraction of total calls.
  • Dev agent: GPT-5 consistently outperforms on code generation tasks at a better price point than Opus. Falls back to Sonnet if GPT-5 is unavailable.
  • Research agent: Gemini 3 Flash's massive context window is a real advantage when processing long documents, SEC filings, or extended news archives. Cost-effective for high-volume research calls.
  • Monitoring agent: Heartbeats, status checks, alert triage. GPT-5 nano is completely adequate and costs almost nothing. DeepSeek as fallback maintains availability.
  • Email agent: Drafting and summarization — Gemini Flash handles this well at low cost. Falls back to nano for very simple triage tasks.

The cost math: before and after

Let's work through a realistic scenario: a three-agent OpenClaw deployment (monitoring, email triage, research) making 300 total model calls per day across all agents.

Without routing (all Claude Sonnet 4.6):

Assuming average call size of 2,000 input tokens, 600 output tokens:

  • 300 calls × 2,000 input tokens = 600,000 tokens = $1.80
  • 300 calls × 600 output tokens = 180,000 tokens = $2.70
  • Daily cost: $4.50 / Monthly cost: ~$135

That's already much better than the naive "all-Opus" approach. But let's see what routing does.

With FastRouter routing (cost-optimized):

Distribute calls by type: 120 monitoring/heartbeat calls (nano), 100 email/routine calls (Gemini Flash), 80 research/complex calls (Sonnet 4.6).

  • 120 nano calls: 240,000 input × $0.00005 + 72,000 output × $0.0004 = $0.012 + $0.029 = $0.041
  • 100 Gemini Flash calls: 200,000 input × $0.0005 + 60,000 output × $0.003 = $0.10 + $0.18 = $0.28
  • 80 Sonnet calls: 160,000 input × $0.003 + 48,000 output × $0.015 = $0.48 + $0.72 = $1.20
  • Daily cost: ~$1.52 / Monthly cost: ~$46

Compared to all-Sonnet at $135/month, that's a 66% reduction. Compared to the 50x cost increase users faced after the Anthropic ban, this is the difference between an unaffordable deployment and a very manageable one.

Extend this to a five-agent setup and the absolute numbers grow, but the percentage savings remain similar — in some configurations higher, as heartbeat calls dominate and those route to the cheapest tier.

Setting cost guardrails

One operational concern with autonomous agents is runaway costs — an agent getting stuck in a loop, receiving an unusually high volume of messages, or a skill behaving unexpectedly can drive API costs up sharply without warning.

FastRouter's per-key budget limits are the right mitigation:

  1. In the FastRouter dashboard, navigate to API Keys
  2. Set a daily spending cap per key (e.g., $5–$10/day for a personal deployment)
  3. Configure an alert threshold (e.g., 80% of daily budget) to notify you before the cap is hit
  4. Set a monthly budget cap as a secondary guardrail

When the cap is hit, requests return an error that OpenClaw will handle via its fallback configuration (you can set a very cheap model as the ultimate fallback for budget-exceeded scenarios, so the agent degrades gracefully rather than going fully offline).

This combination — intelligent routing for efficiency plus hard budget caps for safety — gives you the cost control that a subscription model used to provide, at API pricing.

What's in Part 3

In the final installment, we move from infrastructure to execution: what are people actually building with OpenClaw + FastRouter, how do those agents work, and what's the learning path if you want to build serious agent deployments?

We'll cover:

  • The Felix case study: how an OpenClaw agent generated $195K in revenue
  • The $20/month stock monitoring agent — full architecture breakdown
  • Multi-agent business automation patterns (email, client onboarding, sales)
  • The "OpenClaw Mastery for Everyone" 10-day learning path
  • How to think about model selection as your agent workloads mature

Stay tuned for part 3...

FastRouter.ai is an OpenAI-compatible LLM API gateway providing access to 134+ models with intelligent routing, automatic failover, and transparent pricing. This post reflects pricing and features as of April 2026.

Related Articles

Passing Evals Aren't a Quality Signal
Passing Evals Aren't a Quality Signal
Evals

Passing Evals Aren't a Quality Signal

A high eval pass rate tells you your test set is easy, not that your system is working. A practitioner argument for adversarial evaluation, done right

S
Siv Souvam
1 Min ReadApril, 22 2026