The LiteLLM Supply Chain Attack — What AI-Dependent Codebases Should Audit

On March 24, 2026, two malicious versions of LiteLLM were published to PyPI.

If you haven’t heard of LiteLLM, here’s why you should care: it’s one of the most popular LLM proxy libraries in the Python ecosystem. It provides a unified interface for calling OpenAI, Anthropic, Google, Azure, and dozens of other LLM providers through a single API. Thousands of companies use it as middleware between their applications and AI models.

And for a few hours that day, installing it meant getting backdoored.


What Actually Happened

A maintainer’s PyPI account was compromised — likely by the threat actor group known as TeamPCP. The attackers published LiteLLM versions 1.82.7 and 1.82.8 directly to PyPI, bypassing the normal GitHub release process entirely. No corresponding tag or release existed in the GitHub repository.

The malicious versions contained a .pth file, litellm_init.pth. Python’s site module processes every .pth file in site-packages at interpreter startup and executes any line that begins with import. So once the package is installed, the code runs every time Python starts, not only when you import litellm.
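To make the mechanism concrete, here is a minimal sketch of an audit that lists the .pth lines Python would execute in a given directory. The function name is made up for illustration, and some legitimate packages ship executable .pth lines too, so any hit needs manual review rather than automatic deletion.

```python
from pathlib import Path


def executable_pth_lines(directory):
    """Return (file name, line) pairs for .pth lines that Python would execute.

    The standard-library site module executes a .pth line when it starts
    with "import" followed by a space or tab. Every other non-comment line
    is treated as a path to append to sys.path.
    """
    hits = []
    for pth in sorted(Path(directory).glob("*.pth")):
        for line in pth.read_text(errors="replace").splitlines():
            if line.startswith(("import ", "import\t")):
                hits.append((pth.name, line.strip()))
    return hits
```

One way to use it is to loop over the directories returned by site.getsitepackages() and print the hits for each. Keep in mind that starting Python inside an affected environment already runs whatever .pth files are present, so a scan like this is best run from a separate, known-clean interpreter pointed at the suspect directory.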

The payload operated in three stages:

Stage 1: Collection

A script harvested sensitive files from the host machine:

  • SSH private keys and configs
  • .env files
  • AWS, GCP, and Azure credentials
  • Kubernetes configs and service account tokens
  • Database passwords
  • Shell history
  • Crypto wallet files
  • Anything matching common secret file patterns

It also ran commands to dump environment variables and query cloud metadata endpoints (IMDS, container credentials).

Stage 2: Exfiltration

The collected data was encrypted with AES-256-CBC (the AES key was presumably wrapped with a hardcoded 4096-bit RSA public key, since RSA alone cannot encrypt bulk data), bundled into a tar archive, and POSTed to models.litellm.cloud, a domain controlled by the attackers that looked plausibly legitimate.

Stage 3: Lateral Movement

If a Kubernetes service account token was present, the malware read all cluster secrets across all namespaces and attempted to create privileged alpine:latest pods in the kube-system namespace, one on every node. Each pod mounted the host filesystem and installed a persistent backdoor via systemd.

On the local machine, it attempted the same persistence through ~/.config/sysmon/sysmon.py.
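A quick check for that local persistence file can be sketched as follows. The only path used is the one named above; the constant and function names are illustrative, and a clean result does not prove the machine is clean, since other artifacts may exist.

```python
from pathlib import Path

# Persistence path named in this post, relative to a user's home directory.
# Other artifacts may exist; this list is not exhaustive.
PERSISTENCE_PATHS = [".config/sysmon/sysmon.py"]


def find_persistence_artifacts(home):
    """Return the known persistence files that exist under a home directory."""
    home = Path(home)
    return [
        str(home / relative)
        for relative in PERSISTENCE_PATHS
        if (home / relative).is_file()
    ]
```

Calling find_persistence_artifacts(Path.home()) covers the current user; on a shared machine the same function can be run against each home directory.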

This is not a theoretical attack. It is a real, three-stage compromise that targeted every machine that installed litellm, directly or as a transitive dependency, while the malicious versions were live.


The Ironic Bug

Here’s a darkly funny detail: the .pth file triggers on every Python process startup, including child processes spawned by the malware itself. This created an exponential fork bomb that crashed machines — which is actually what led to the attack being discovered quickly. The malware had a bug that made it too visible.

If the attackers had been more careful, the compromise could have stayed hidden for days or weeks.


Why This Hits Different for AI Codebases

Supply chain attacks on PyPI aren’t new. Typosquatting, dependency confusion, compromised maintainers — we’ve seen all of these before.

But this one is different because of what LiteLLM is and where it sits in the stack.

LiteLLM Is Infrastructure

LiteLLM isn’t a utility library. It’s the layer that handles all communication between your application and AI models. It sees every prompt, every response, every API key. Compromising LiteLLM gives an attacker access to the most sensitive parts of an AI system.

AI Codebases Have Unusually Broad Access

Applications that use LLMs tend to have access to a lot of things:

  • Customer data (for context windows)
  • Internal APIs (for tool calling)
  • Cloud credentials (for model provider access)
  • Kubernetes secrets (for deployment)

A compromised LLM middleware library has a blast radius that extends far beyond the application itself.

The Dependency Chain Is Deep

Many teams don’t install LiteLLM directly. It comes in as a transitive dependency — through MCP plugins, AI agent frameworks, or internal tools that depend on it. The team that discovered the attack at FutureSearch found it because LiteLLM was pulled in as a transitive dependency of a Cursor MCP plugin. They didn’t even know it was in their dependency tree.


The Container Angle

This attack is a textbook argument for the container isolation philosophy.

If your AI agent or LLM-powered application runs inside a container with:

  • No access to the host filesystem
  • No access to SSH keys, .env files, or cloud credentials
  • No Kubernetes service account token mounted
  • Network access restricted to known endpoints
  • A read-only root filesystem

Then none of the three attack stages work.

The collection stage finds nothing to steal. The exfiltration stage can’t reach the attacker’s server. The lateral movement stage has no service account token and no host filesystem to persist on.
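One rough way to test the first of those claims is to run a small exposure check from inside the container and see what a process there can actually reach. The sketch below assumes the conventional default locations for SSH, cloud, and Kubernetes credentials; those paths are my assumption, not a list taken from the malware, and should be adjusted to your setup.

```python
import os
from pathlib import Path

# Conventional default locations (assumed for illustration).
# Relative entries are resolved against the home directory.
SENSITIVE_PATHS = [
    ".ssh",
    ".aws",
    ".azure",
    ".config/gcloud",
    ".kube/config",
    ".env",
    "/var/run/secrets/kubernetes.io/serviceaccount/token",
]


def exposed_paths(home, candidates=SENSITIVE_PATHS):
    """Return the candidate paths that exist from this process's point of view."""
    found = []
    for candidate in candidates:
        path = Path(candidate)
        if not path.is_absolute():
            path = Path(home) / path
        # os.path.exists returns False instead of raising on permission errors.
        if os.path.exists(path):
            found.append(str(path))
    return found
```

In a well-isolated container, exposed_paths(Path.home()) should come back empty. Anything it returns is material the collection stage could have picked up.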

This is not about preventing the compromise from entering your system — it’s about ensuring that when it does (and eventually it will), the damage is contained.

The container boundary is your blast radius limiter.


What You Should Actually Audit

If you’re running any AI-dependent codebase in Python, here’s a concrete checklist:

1. Check for LiteLLM Specifically

pip show litellm
pip list | grep litellm
find ~/.cache -name "litellm_init.pth"
find / -name "litellm_init.pth" -path "*site-packages*" 2>/dev/null

If you installed version 1.82.7 or 1.82.8 on March 24 or 25, 2026, assume compromise. Rotate all credentials that were accessible from that environment.
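The version check can also be scripted, which helps when there are many virtual environments to cover. This is a minimal sketch using the standard-library importlib.metadata; the helper names are illustrative. It only reports what is installed right now, so an environment that was upgraded or downgraded after the window will look clean even if it was exposed.

```python
from importlib import metadata

# The two versions named in this post as malicious.
COMPROMISED_VERSIONS = {"1.82.7", "1.82.8"}


def installed_version(package="litellm"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_compromised(version):
    """Return True when the version string is one of the known-bad releases."""
    return version in COMPROMISED_VERSIONS
```

A call such as is_compromised(installed_version()) reads package metadata without importing litellm, but the interpreter startup itself still processes any .pth file in that environment.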

2. Audit Your Transitive Dependencies

pip install pipdeptree
pipdeptree --reverse --packages litellm

Know why LiteLLM is in your dependency tree. If it’s coming through a framework you don’t control, evaluate whether you actually need that framework.
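If installing another tool is not an option, a similar reverse lookup can be sketched with importlib.metadata alone. The helper names below are hypothetical, the name parsing is deliberately simple, and the result covers direct dependents only (including ones that require the package through an optional extra), not the full chain.

```python
import re
from importlib import metadata


def requirement_name(requirement):
    """Extract the normalized project name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    name = match.group(1) if match else ""
    # Treat runs of "-", "_" and "." as equivalent and ignore case.
    return re.sub(r"[-_.]+", "-", name).lower()


def direct_dependents(target, requires_by_dist):
    """Return the distributions whose requirements mention the target."""
    target = requirement_name(target)
    return sorted(
        dist
        for dist, requirements in requires_by_dist.items()
        if any(requirement_name(req) == target for req in requirements)
    )


def installed_requirements():
    """Map each installed distribution name to its declared requirements."""
    return {
        dist.metadata["Name"]: dist.requires or []
        for dist in metadata.distributions()
        if dist.metadata["Name"]
    }
```

Running direct_dependents("litellm", installed_requirements()) lists the packages that pull it in directly. Repeating the lookup on each result walks further up the tree.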

3. Pin Your Dependencies

If you’re still using unpinned or loosely pinned versions in production, stop. Use lockfiles. Use hash verification.

litellm==1.82.6 --hash=sha256:abc123...

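With hash checking enabled, pip refuses to install any file whose hash does not match the pinned value, so a tampered or substituted release fails at install time instead of running on your machine.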
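The comparison behind hash pinning is simple enough to sketch by hand, which is useful for spot-checking a wheel or sdist you have already downloaded. This illustrates the idea only and is not pip’s implementation; the function names are made up.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path, pinned):
    """Check a file against a pin written as 'sha256:<hex digest>'."""
    algorithm, _, expected = pinned.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algorithm!r}")
    return sha256_of(path) == expected.lower()
```

The pinned value has to come from a source you trust, such as a lockfile committed before the incident. A hash copied from the index after a malicious upload would match the malicious file.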

4. Monitor PyPI Releases Against GitHub

The attack was detectable because the PyPI release had no corresponding GitHub tag. Automated monitoring of this discrepancy could have flagged the release as soon as it appeared.
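A monitor for that discrepancy could start from something like the sketch below. It reads version strings from the PyPI JSON API and compares them with a list of source tags that you supply, for example from git ls-remote --tags. The tag convention (bare version, or version prefixed with "v") is an assumption and a given project may use a different scheme; the function names are illustrative.

```python
import json
from urllib.request import urlopen


def pypi_versions(package):
    """Fetch all released version strings from the PyPI JSON API."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urlopen(url, timeout=10) as response:
        return set(json.load(response)["releases"])


def untagged_releases(pypi, tags):
    """Return PyPI versions that have no matching source tag.

    Assumes each tag is either the bare version or the version with a
    leading "v"; adjust the normalization for other tag schemes.
    """
    tagged = {tag[1:] if tag.startswith("v") else tag for tag in tags}
    return sorted(set(pypi) - tagged)
```

Run on a schedule, a non-empty result from untagged_releases(pypi_versions("litellm"), tags) is a signal worth alerting on, although projects that publish without tagging will produce false positives.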

5. Run AI Workloads in Isolated Environments

This is the structural fix. Don’t give your LLM middleware access to SSH keys and Kubernetes secrets. It doesn’t need them. The principle of least privilege applies to AI infrastructure just like everything else.
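At the process level, one small piece of least privilege is to stop handing the full parent environment to the LLM-facing process. The sketch below builds an allowlisted environment for a child process. The variable names in the allowlist are placeholders, and this limits environment variables only; it does nothing about the filesystem or network, which is what the container boundary is for.

```python
import os
import subprocess

# Hypothetical allowlist: only what the child process actually needs.
ALLOWED_ENV = ("PATH", "HOME", "LANG", "OPENAI_API_KEY")


def minimal_env(allowed, source=None):
    """Return a copy of the environment restricted to the allowed names."""
    source = os.environ if source is None else source
    return {name: source[name] for name in allowed if name in source}


def run_isolated(command, allowed=ALLOWED_ENV):
    """Run a command with only the allowlisted environment variables set."""
    return subprocess.run(command, env=minimal_env(allowed), check=True)
```

With this in place, a cloud credential exported in the parent shell never reaches the child unless it is added to the allowlist on purpose.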


The Broader Pattern

LiteLLM won’t be the last AI middleware to get compromised.

The AI ecosystem is moving fast. New libraries appear weekly. Maintainers are often individuals or small teams. The attack surface is enormous and growing.

What’s different about AI supply chain attacks is the leverage. Compromising a utility library is bad. Compromising the layer that sits between applications and AI models — the layer that sees every prompt, every API key, every customer query — is catastrophic.

The defenses are not new:

  • Pin dependencies
  • Verify hashes
  • Run in isolated environments
  • Monitor for unauthorized package releases
  • Apply least privilege

But in 2026, with AI middleware becoming critical infrastructure, the cost of not doing these things just went up dramatically.


Final Thought

The LiteLLM attack lasted a few hours before being discovered — partly due to the attackers’ own bug crashing machines.

Next time, the malware won’t have a fork bomb bug.

Next time, it might sit quietly in your pipeline for weeks, reading every prompt and every API key, before anyone notices.

The question isn’t whether your AI dependencies will be targeted.

The question is whether your environment is set up so that it doesn’t matter when they are.

This post is licensed under CC BY 4.0 by the author.