Your LLM Gateway Shouldn't Be a Pip Dependency
There's something deeply ironic about what happened to LiteLLM on March 24.
LiteLLM is, by design, a credential proxy. It manages API keys for every LLM provider your organization uses. OpenAI, Anthropic, Google, Cohere, dozens more. All those keys flow through one package.
That package got compromised. A threat actor published poisoned versions to PyPI that harvested every secret they could find on the host machine. SSH keys, cloud credentials, environment variables, Kubernetes configs, crypto wallets. The malware encrypted it all and shipped it to an attacker-controlled server. If it found a Kubernetes service account, it tried to spread to every node in the cluster.
The cruel part: the one package that handles all your LLM API keys was the one that got turned into a credential stealer. The attacker didn't need to figure out where your secrets lived. LiteLLM already knew.
We're not writing this to dunk on LiteLLM's team. They're dealing with a supply chain attack that originated from a compromised security scanner (Trivy) five days earlier. The attacker stole a PyPI publishing token from LiteLLM's CI/CD pipeline and used it to push the poisoned versions directly. The maintainers didn't upload the bad code. Their build tooling was weaponized against them.
But this incident forces a question that's been lurking for a while: should your LLM gateway be a pip dependency?
The dependency problem
Karpathy put it bluntly in his thread about the attack. Supply chain attacks are "basically the scariest thing imaginable in modern software." Every time you install a dependency, you're trusting the entire tree beneath it. LiteLLM doesn't just affect the people who installed it directly. It's a transitive dependency for projects like DSPy, MLflow, and many others. You might never have typed pip install litellm and still gotten hit.
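If you aren't sure whether litellm is hiding in your own environment's dependency tree, you can find out without installing anything extra. A minimal sketch using only the standard library's `importlib.metadata`, which scans the declared requirements of every installed distribution:

```python
import re
from importlib.metadata import distributions

def packages_requiring(name: str) -> list[str]:
    """Return installed distributions that declare `name` as a requirement."""
    target = name.lower().replace("_", "-")
    dependents = set()
    for dist in distributions():
        for req in dist.requires or []:
            # Requirement strings look like "litellm (>=1.0) ; extra == 'proxy'";
            # grab just the leading package name and normalize it.
            match = re.match(r"[A-Za-z0-9_.-]+", req)
            if match and match.group(0).lower().replace("_", "-") == target:
                dependents.add(dist.metadata["Name"])
    return sorted(dependents)

# Empty list means nothing installed here declares litellm as a dependency.
print(packages_requiring("litellm"))
```

Note this only sees direct declarations one level deep per package; for a full transitive picture you'd walk the graph recursively, but even this flat scan catches the DSPy/MLflow-style cases where litellm arrives without you ever asking for it.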
This is the fundamental tension with open-source LLM proxies. They're powerful, they're flexible, and they sit inside your environment with access to everything. The same design that makes them useful (they handle your credentials so you don't have to manage multiple SDKs) makes them catastrophic when compromised.
A different architecture
We built FastRouter as a hosted API endpoint for exactly this reason. There's no pip package. No dependency tree. No code running on your machine. You point your existing OpenAI SDK at go.fastrouter.ai/api/v1 instead of api.openai.com/v1, and you're done.
Your LLM API keys live on our servers, managed with per-project budgets, RBAC, and per-key rate limits. Developers on your team get a FastRouter API key. They never see or handle the underlying provider keys. If someone's laptop gets compromised, the blast radius doesn't include your OpenAI, Anthropic, and Google credentials.
This isn't theoretical anymore. The LiteLLM attack showed that a compromised LLM proxy puts every key it manages at risk.
What to do this week
If you were running LiteLLM 1.82.7 or 1.82.8, assume everything is compromised. Rotate all credentials. Check for persistence mechanisms. Audit your Kubernetes clusters.
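A quick first triage step is simply confirming which version is installed. A minimal sketch using the standard library (the two version strings come from this post; check the official advisory for the authoritative list):

```python
from importlib.metadata import version, PackageNotFoundError

# Poisoned versions named in this incident; verify against the advisory.
COMPROMISED = {"1.82.7", "1.82.8"}

def litellm_status() -> str:
    try:
        installed = version("litellm")
    except PackageNotFoundError:
        return "litellm is not installed in this environment"
    if installed in COMPROMISED:
        return f"COMPROMISED: litellm {installed}; rotate every credential now"
    return f"litellm {installed} is not one of the known-bad versions"

print(litellm_status())
```

Run this in every environment that might have pulled litellm in, including CI runners and Kubernetes images, not just developer laptops. A clean result here does not clear you if a compromised version was installed and later upgraded; rotation still applies if those versions ever ran.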
If you're now evaluating alternatives, here's the honest trade-off:
Self-hosted open-source proxies give you full control over the code. You can inspect it, modify it, run it wherever you want. The cost is that you own the supply chain risk, the patching, and the operational burden. This week showed what that cost looks like when things go wrong.
Managed gateways give you less control over the underlying code but externalize the supply chain risk entirely. Your attack surface is one API key, not an entire dependency tree.
We think that trade-off tilts pretty clearly toward managed gateways for production workloads. But we're biased, so here's a better way to decide: route your existing traffic through FastRouter for seven days. Same API format. One line change. We'll generate a report showing your cost savings, reliability gaps, and latency wins. If you don't like what you see, swap the URL back.