Tracing & Observability

Trace every request span by span

FastRouter groups multi-step workflows into a single trace using the W3C traceparent header-so you can inspect prompts, completions, latency, token usage, and cost across every model, provider, and tool call, with no SDK to install.

No credit card required · Free to start

Trace
trace_id4bf92f35…0e4736
4 spans2.40s
agent-run
2.40s
POST /v1/chat/completions
1.10s
github-mcp__create_issue
0.38s
POST /v1/chat/completions
0.90s
Each spanlatencytokenscost
Why tracing

Observability that follows the whole request

See past a single completion. FastRouter stitches every model call, tool call, and retry into one trace you can read end to end.

Group anything into one trace

Reuse a single traceparent header and every related request-agent steps, chat turns, or parallel calls-collapses into one trace of ordered spans.

Full span-level detail

Each span captures latency, token usage, cost, and the complete request and response, so you can inspect exactly what happened at every step.

Standards-based, no SDK

Built on the W3C Trace Context standard, tracing works with any HTTP client-no proprietary SDK or instrumentation to maintain.

How it works

One traceparent, one trace, many spans

Send the W3C traceparent header with your requests. FastRouter extracts the trace, records each span with full detail, and groups everything under a single trace_id.

Your application

Send a traceparent

traceparentparent_id
  • Generate a header: version-trace_id-parent_id-flags.
  • Reuse the same traceparent across every request in the workflow.

FastRouter

Gateway records the span

Extract trace_idNew span_id
  • Captures latency, tokens, cost, and the full request and response.
  • Names the span POST /api/v1/chat/completions and stores it.

Logs & Traces

Spans grouped into one trace

Grouped by trace_idOrdered spans
  • Every call sharing a trace_id appears as a single trace.
  • Inspect ordered spans across models, providers, and steps.

traceparent header

00-4bf9…e4736-00f0…02b7-01
  • versionFixed · W3C standard
  • trace_id32 hex · groups the trace
  • parent_id16 hex · your caller span
  • flagsSampling flag

Standards-based by design

Tracing is built on the W3C Trace Context standard, so it works with any HTTP client and needs no SDK. Optional headers let you label spans and control their IDs for cleaner, more readable traces.

  • x-span-name sets a human-readable label for a span.
  • x-span-id sets a custom span ID, auto-generated if omitted.
  • x-trace-id overrides the trace_id from the traceparent header.
Span-level detail

Inspect every span, end to end

Open any span to see the model and provider, exact latency, token usage, cost, and the full request and response-no guessing about what the model actually saw or returned.

Latency per span

Every span records how long it took, so you can pinpoint the slow step in a chain.

Tokens and cost

See token usage and cost on each call to understand where spend accumulates.

Full request and response

Captured prompts and completions let you replay and debug exactly what happened.

Span detail

span_id a1b2c3d4e5f6a7b8

200 OK
Latency

1.10s

Tokens

1,284

512 in · 772 out

Cost

$0.0043

Model

gpt-4.1

Request · user

"Summarize the open issues and file a tracking ticket."

Response · assistant

"Filed FR-482 covering 3 open issues across the gateway repo."

Grouping

Group every step into a single trace

Reuse the same traceparent across a workflow and every request lands in one ordered trace-ideal for agentic chains, multi-turn chat sessions, and bursts of parallel calls.

Ordered spans

All calls sharing a trace_id appear together as a single, ordered trace.

Reuse one header

Generate a traceparent once and pass it on every request in the workflow.

Built for agents

Trace multi-step tool and function calls the same way you trace a single completion.

One trace, many spans

Same traceparent
Trace4bf92f35…0e4736
  • ├──POST /api/v1/chat/completions· gpt-4.11.10s
  • ├──github-mcp__create_issue· tool call0.38s
  • └──POST /api/v1/chat/completions· retry0.90s
Agentic chains, chat turns, and parallel calls collapse into one ordered trace.
Standards-based

Works with any HTTP client

Tracing uses the open W3C Trace Context standard, so there is nothing proprietary to install. Optional headers let you name spans and set your own span or trace IDs.

No SDK required

Add a single header from cURL, the OpenAI SDKs, or any HTTP client you already use.

Readable span names

Set x-span-name to label spans like hotel-search for faster scanning.

Custom span and trace IDs

Provide x-span-id or x-trace-id to align traces with your own systems.

POST /v1/chat/completions
cURL
curl https://api.fastrouter.ai \
-H "Authorization: Bearer ••••" \
-H "traceparent: 00-4bf9…-00f0…-01" \
-H "x-span-name: hotel-search" \
-H "x-span-id: a1b2c3d4e5f6a7b8"
Any HTTP clientNo SDK required
Use cases

From a single call to complex agent runs

Whatever shape your workload takes, tracing keeps the full picture in one place-so debugging, optimizing, and explaining behavior gets dramatically easier.

Debug agentic workflows

Follow multi-step chains with tool and function calls across one trace to find exactly where an agent went wrong.

Trace full chat sessions

Link every turn of a conversation under one trace to see how context, latency, and cost build over time.

Group parallel requests

Bundle concurrent calls into a single trace so fan-out workloads stay easy to reason about.

Attribute latency and cost

Spot the slowest spans and the calls driving token usage so you can optimize with confidence.

FAQ

Tracing questions, answered

FastRouter follows the W3C Trace Context standard. When you send a request with a traceparent header, the gateway extracts the trace_id to group related requests, records your parent_id as the caller span, generates a new span_id, captures latency, tokens, cost, and the full request and response, then stores the span under the matching trace.

The header is version-trace_id-parent_id-flags. version is 00 (fixed by the W3C standard), trace_id is 32 hex characters that identify the whole trace, parent_id is the 16-hex-character ID of your application's caller span, and flags is the sampling flag, for example 01. A full example is 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01.

Every span records latency, token usage, cost, and the full request and response. Gateway spans are named after the endpoint, such as POST /api/v1/chat/completions, so you can scan a trace and immediately see which calls ran.

No. Tracing is built on the open W3C Trace Context standard and works with any HTTP client-cURL, the OpenAI SDKs, or your own code. You just generate a traceparent header and reuse it across the requests you want to group.

Reuse the exact same traceparent header on every request in a workflow. All calls that share the same trace_id appear as a single trace with ordered spans, which is ideal for agentic chains, multi-turn chat sessions, and parallel requests.

Yes. Optional headers give you control: x-span-name sets a human-readable label like hotel-search, x-span-id sets a custom span ID (auto-generated if omitted), and x-trace-id overrides the trace_id from the traceparent header.

See exactly what your models are doing

Add a traceparent header, reuse it across your workflow, and watch every span-prompts, latency, tokens, and cost-line up in one trace.