Building Safer AI Applications: A Practical Guide to Guardrails in FastRouter.ai

FastRouter Team
3 Min Read · February 27, 2026

When you deploy an AI-powered application, you're not just shipping a feature—you're establishing trust. Whether you're building a customer support chatbot, a content generation tool, or an internal knowledge assistant, one wrong response can expose sensitive data, alienate users, or damage your brand reputation.

This is where guardrails become essential. Think of them as your AI safety net: intelligent checkpoints that validate both incoming requests and outgoing responses before they reach users. In this guide, we'll walk through how to implement guardrails in FastRouter.ai using a real-world scenario that demonstrates their practical value.

Why Guardrails Matter: The Real Risks

Large Language Models are powerful, but they're unpredictable. Without proper controls, they can:

  • Leak sensitive information like email addresses, phone numbers, or credit card details
  • Veer off-topic into areas your application shouldn't address
  • Generate toxic content in response to adversarial prompts
  • Produce malformed outputs that break downstream systems

Guardrails give you programmatic control over these risks without sacrificing the flexibility that makes LLMs useful.

Our Use Case: A Healthcare Appointment Assistant

Let's build "MediSchedule," an AI assistant for a medical practice that helps patients:

  • Schedule, reschedule, and cancel appointments
  • Get office location and hours information
  • Understand insurance and payment options

This scenario is high-stakes because:

  1. Privacy is non-negotiable – We cannot expose Protected Health Information (PHI)
  2. Scope must be limited – The bot shouldn't provide medical advice
  3. User experience matters – We need safety without frustrating legitimate users

Understanding Guardrail Modes: Observe vs. Validate

FastRouter offers two operating modes that determine how your application behaves when a guardrail detects an issue:

Observe Mode (Status Code 246)

The guardrail checks run, but failures don't block requests. You receive full logging about what triggered, while users experience no disruption. This is your testing phase—understanding what would be blocked before you actually enforce it.

Use Observe when:

  • Rolling out new guardrails for the first time
  • Testing configurations on production traffic
  • Gathering data to tune sensitivity

Validate Mode (Status Code 446)

When a guardrail fails, the request is immediately blocked. Your application receives an error instead of a potentially problematic response.

Use Validate when:

  • Compliance requirements are absolute
  • You've tested thoroughly in Observe mode
  • The risk of a false positive is acceptable

Important: Always start in Observe mode on a subset of traffic. This data-driven approach prevents you from accidentally blocking legitimate user interactions.
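The two modes boil down to three status codes your client has to handle. As a minimal sketch, a helper like the one below (the function name and return shape are our own, not part of FastRouter's SDK) can centralize that decision:

```javascript
// Map FastRouter's guardrail status codes to a caller-facing action.
// 200 = all checks passed, 246 = Observe (triggered, not blocked),
// 446 = Validate (blocked). Shape of the return value is our choice.
function interpretGuardrailStatus(status) {
  switch (status) {
    case 200:
      return { mode: "passed", deliver: true, log: false };
    case 246:
      return { mode: "observe", deliver: true, log: true };
    case 446:
      return { mode: "validate", deliver: false, log: true };
    default:
      return { mode: "error", deliver: false, log: true };
  }
}
```

Keeping this mapping in one place makes it easy to flip a guardrail from Observe to Validate later without touching every call site.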

Implementation: Step-by-Step

Step 1: Create Your Guardrails

In the FastRouter dashboard, navigate to Guardrails → Browse Templates and create:

1. Input Topic Adherence (Observe Mode)

  • Name: "Appointment Topics Only"
  • Action: Observe
  • Stage: Input
  • Allowed topics: Appointment scheduling, office information, insurance, payments
  • Blocked topics: Medical diagnosis, treatment advice, medication information
  • Config ID: gr_topic_apt

2. Input & Output PII Protection (Validate Mode)

  • Name: "PHI Detector"
  • Action: Validate
  • Stage: Input and Output
  • Detection types: Email addresses, phone numbers, SSN, medical record numbers
  • Config ID: gr_pii_protect

3. Output Toxicity Filter (Validate Mode)

  • Name: "Toxic Content Blocker"
  • Action: Validate
  • Stage: Output
  • Config ID: gr_toxic_block

4. Output Format Validator (Validate Mode)

  • Type: RegEx Check
  • Name: "JSON Structure Check"
  • Pattern: Ensures response contains required fields for UI rendering
  • Config ID: gr_json_format

Step 2: Integrate Into Your API Calls

Here's how to wire guardrails into a typical request:

```javascript
body: JSON.stringify({
  model: "openai/gpt-4",
  // Guardrails require full-response evaluation and may not be
  // compatible with streaming, depending on configuration.
  stream: false,
  messages: [
    {
      role: "system",
      content: "You are a helpful appointment scheduling assistant for MediSchedule Medical Center. Help patients with appointments, office info, and general questions. Do not provide medical advice."
    },
    {
      role: "user",
      content: userMessage
    }
  ],
  input_guardrails: ["gr_topic_apt", "gr_pii_protect"],
  output_guardrails: ["gr_pii_protect", "gr_toxic_block", "gr_json_format"]
})
```
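In practice you'll want to build this body in one place so the guardrail lists stay consistent across call sites. A minimal sketch (the `buildRequestBody` helper name is ours; the config IDs match the guardrails created in Step 1):

```javascript
// Assemble the chat-completions request body with guardrails attached.
// Helper name is our own; endpoint and auth handling are omitted.
function buildRequestBody(userMessage) {
  return JSON.stringify({
    model: "openai/gpt-4",
    stream: false, // guardrails evaluate the full response
    messages: [
      {
        role: "system",
        content:
          "You are a helpful appointment scheduling assistant for MediSchedule Medical Center. " +
          "Help patients with appointments, office info, and general questions. Do not provide medical advice."
      },
      { role: "user", content: userMessage }
    ],
    input_guardrails: ["gr_topic_apt", "gr_pii_protect"],
    output_guardrails: ["gr_pii_protect", "gr_toxic_block", "gr_json_format"]
  });
}
```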

Step 3: Handle Status Codes Appropriately

Your application needs logic to interpret guardrail results:

```javascript
const data = await response.json();

switch (response.status) {
  case 200:
    // All guardrails passed
    return {
      success: true,
      message: data.choices[0].message.content,
      guardrails: data.guardrails
    };

  case 246:
    // Observe mode: guardrail triggered but request processed
    logGuardrailWarning(data.guardrails);
    return {
      success: true,
      message: data.choices[0].message.content,
      warning: "Guardrail triggered - review logs"
    };

  case 446:
    // Validate mode: request blocked
    return {
      success: false,
      userMessage: "I apologize, but I can only assist with appointment scheduling and office information. For medical questions, please contact our nurse line at (555) 867-5309."
    };
}
```
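The `logGuardrailWarning` helper in the 246 branch is not something FastRouter provides; it's yours to implement. One possible sketch, assuming the guardrails metadata shape shown later in this post:

```javascript
// Extract the failed checks from the guardrails metadata so they can be
// forwarded to your logger or monitoring pipeline. Helper names are ours.
function collectFailedChecks(guardrails) {
  const failed = [];
  for (const stage of ["input", "output"]) {
    for (const check of guardrails?.[stage]?.checks ?? []) {
      if (!check.passed) failed.push({ stage, id: check.id, name: check.name });
    }
  }
  return failed;
}

function logGuardrailWarning(guardrails) {
  for (const f of collectFailedChecks(guardrails)) {
    console.warn(`guardrail triggered: ${f.stage}/${f.id} (${f.name})`);
  }
}
```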

Real-World Scenarios

Scenario 1: Normal Appointment Request (Status 200)

User: "I need to schedule a physical exam for next Tuesday morning."

Guardrail Flow:

  1. Input check (Topic Adherence): ✅ Passes – legitimate scheduling request
  2. LLM generates response: "I'd be happy to help schedule your physical exam. We have availability next Tuesday at 9:00 AM or 10:30 AM. Which works better for you?"
  3. Output checks:
    • PII Protection: ✅ No sensitive data
    • Toxicity: ✅ Appropriate content
    • JSON Format: ✅ Properly structured

Response: Status 200 with full message delivered to user.

Scenario 2: Off-Topic Request in Observe Mode (Status 246)

User: "What medication should I take for my headache?"

Guardrail Flow:

  1. Input check (Topic Adherence): ⚠️ Fails – medical advice request detected
  2. Because we're in Observe mode: Request proceeds to LLM
  3. LLM response: "I'm designed to help with appointment scheduling. For medical advice, please consult with your healthcare provider. Would you like to schedule an appointment to discuss your symptoms?"

Response: Status 246. The response is delivered, but your logs flag that topic adherence was violated. After reviewing patterns, you might:

  • Adjust your system prompt
  • Switch to Validate mode to block these requests
  • Add proactive messaging in your UI

Scenario 3: PII Leak in Validate Mode (Status 446)

User: "Can you send my lab results to [email protected]?"

Guardrail Flow:

  1. Input PII check: ❌ Fails – email address detected
  2. Because we're in Validate mode: The request is blocked immediately, before it ever reaches the model
  3. No output guardrails executed: Generation never occurs, so there is nothing to check

Response: Status 446 with error. Your application shows:
"For your security, I cannot process requests involving personal contact information. Please discuss lab results directly with your healthcare provider or through your patient portal."

The potentially sensitive request never reaches the model.
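You can also add an optional client-side pre-check that catches obvious contact details before the request is even sent; the server-side PHI Detector remains the real control. A minimal sketch with deliberately simple example patterns:

```javascript
// Lightweight client-side screen for obvious PII. These regexes are
// illustrative only and far less thorough than a real PII guardrail.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/;
const US_PHONE_RE = /\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}/;

function containsObviousPII(text) {
  return EMAIL_RE.test(text) || US_PHONE_RE.test(text);
}
```

Failing fast on the client saves a round trip, but never treat it as a substitute for the server-side check.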

Performance and Cost Considerations

Basic Guardrails (RegEx)

  • Latency: 10-50ms
  • Cost: Negligible
  • Use for: Format validation, simple pattern matching

LLM Judge Guardrails (PII, Topic, Toxicity)

  • Latency: 500ms-2s per check
  • Cost: Consumes tokens from your Default Organization Key
  • Use for: Semantic understanding, context-aware validation

Optimization Strategy:

  • Apply fast regex checks first
  • Layer expensive LLM checks only where necessary
  • For low-risk queries (like "What are your hours?"), consider skipping heavy guardrails
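The layering idea can be sketched as a small gate: run the cheap regex check first and only fall through to the expensive LLM-judge call when necessary. Here `judgeWithLLM` is an injected stand-in for a real guardrail call, so the flow is testable without a network dependency:

```javascript
// "Cheap checks first": a fast regex gate in front of an LLM judge.
// HOURS_RE and screenInput are our own illustrative names.
const HOURS_RE = /\b(hours|open|close|location|address)\b/i;

async function screenInput(text, judgeWithLLM) {
  // Fast path: obviously low-risk informational questions skip the judge.
  if (HOURS_RE.test(text)) return { allowed: true, judged: false };
  // Slow path: semantic topic check via the (injected) LLM judge.
  const verdict = await judgeWithLLM(text);
  return { allowed: verdict.allowed, judged: true };
}
```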

Best Practices for Production

1. Progressive Rollout

  • Week 1-2: Observe mode on 10% of traffic
  • Week 3-4: Analyze logs, tune configurations
  • Week 5+: Gradually enable Validate mode on critical guardrails
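A deterministic sampler keeps the same users in the rollout cohort across sessions, which makes the Observe-mode logs far easier to analyze than random per-request sampling. A minimal sketch (the FNV-1a-style hash is our choice, not anything FastRouter prescribes):

```javascript
// Hash a stable user ID into a 0-99 bucket for percentage rollouts.
function rolloutBucket(userId) {
  let h = 2166136261; // FNV-1a offset basis
  for (let i = 0; i < userId.length; i++) {
    h ^= userId.charCodeAt(i);
    h = Math.imul(h, 16777619); // FNV-1a prime, 32-bit multiply
  }
  return (h >>> 0) % 100;
}

// True if this user falls inside the first `percent` buckets.
function inObserveCohort(userId, percent) {
  return rolloutBucket(userId) < percent;
}
```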

2. Layer Your Defense

  • Combine deterministic (regex) and intelligent (LLM Judge) checks
  • Don't rely on a single "master guardrail"
  • Create focused guardrails for specific concerns

3. User-Friendly Error Handling

Never expose technical guardrail failures to users. Instead of "Guardrail validation failed: PII detected," show contextual guidance like "For your privacy, please don't share personal information in chat. How else can I help you?"
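One way to keep this consistent is a lookup that translates internal failure IDs into contextual copy. The IDs below mirror the guardrails from Step 1; the messages are examples, not FastRouter output:

```javascript
// Map guardrail config IDs to user-facing messages, with a safe fallback.
const FRIENDLY_MESSAGES = {
  gr_pii_protect:
    "For your privacy, please don't share personal information in chat. How else can I help you?",
  gr_topic_apt:
    "I can help with appointments, office information, insurance, and payments. For medical questions, please contact your provider."
};

function userMessageFor(failedCheckId) {
  return (
    FRIENDLY_MESSAGES[failedCheckId] ??
    "I wasn't able to process that request. Could you rephrase it?"
  );
}
```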

4. Monitor and Iterate

Set up alerts for:

  • High failure rates (>5% might indicate misconfiguration)
  • New failure patterns (emerging user behaviors)
  • Cost spikes from LLM Judge guardrails

5. Test Edge Cases

Build a test suite covering:

  • Legitimate requests that might trigger false positives
  • Adversarial prompts attempting to bypass guardrails
  • Boundary cases in your allowed topics

The Complete Response Object

When guardrails run, FastRouter returns detailed metadata:

```json
{
  "id": "fr_abc123",
  "model": "gpt-4",
  "choices": [...],
  "guardrails": {
    "input": {
      "passed": true,
      "checks": [
        {
          "id": "gr_topic_apt",
          "name": "Appointment Topics Only",
          "passed": true,
          "cost": 0.00011
        }
      ]
    },
    "output": {
      "passed": true,
      "checks": [
        {
          "id": "gr_pii_protect",
          "name": "PHI Detector",
          "passed": true,
          "cost": 0.00012
        },
        {
          "id": "gr_toxic_block",
          "name": "Toxic Content Blocker",
          "passed": true,
          "cost": 0.00010
        },
        {
          "id": "gr_json_format",
          "name": "JSON Structure Check",
          "passed": true,
          "cost": 0.0
        }
      ]
    }
  }
}
```

This transparency lets you track performance, costs, and reliability at a granular level.
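For example, the per-check `cost` fields can be aggregated for a spend dashboard. A minimal sketch, assuming the metadata shape shown above (the helper itself is ours):

```javascript
// Sum guardrail spend across input and output checks in a response.
function totalGuardrailCost(guardrails) {
  let total = 0;
  for (const stage of ["input", "output"]) {
    for (const check of guardrails?.[stage]?.checks ?? []) {
      total += check.cost ?? 0;
    }
  }
  return total;
}
```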

Conclusion: From Experimentation to Trust

Guardrails transform LLMs from unpredictable experimental tools into reliable production systems. They let you:

  • Ship faster with confidence that risks are managed
  • Scale safely without proportional increases in human oversight
  • Meet compliance requirements with auditable controls
  • Build user trust through consistent, appropriate interactions

The key is starting thoughtfully: observe before you validate, test incrementally, and let real usage data guide your configuration. With FastRouter's guardrails, you're not just hoping your AI behaves—you're ensuring it.

Ready to add guardrails to your application? Log into the FastRouter dashboard and create your first guardrail in under five minutes. Your users—and your risk management team—will thank you.
