A single file weighing 59.8 MB recently appeared on the npm registry, and it belonged to Anthropic. In the world of software deployment, this is the equivalent of a chef serving a gourmet meal to a customer but accidentally leaving the entire kitchen's secret recipe book and architectural blueprints on the plate. While it seemed like a clumsy administrative error, this leak was not an isolated incident. It was a symptom of a systemic vulnerability that has plagued the most powerful AI labs in the world.

Over a period of just 50 days, OpenAI, Anthropic, and Meta fell victim to a series of supply chain attacks. The most striking detail of these breaches is that the attackers did not attempt to jailbreak the models, bypass safety filters, or trick the AI into hallucinating. They ignored the intelligence of the models entirely. Instead, they targeted the plumbing—the deployment pipelines that move code from a developer's laptop to the end user. While the industry's red teams spent thousands of hours testing the boundaries of model safety, the trucks delivering those models were arriving with their back doors wide open.

The 50-Day Breach Cycle from LiteLLM to Mini Shai-Hulud

The onslaught began between March 24 and March 27, when a hacking group known as TeamPCP initiated a sophisticated chain of compromise. The attackers first stole credentials for Trivy, a widely used vulnerability scanner, and used that access to inject malicious code into the Python package of LiteLLM, a proxy gateway used to manage multiple LLMs. This was a classic poisoning of the well; rather than attacking the final product, they contaminated the ingredients. The compromised package sat on PyPI for approximately 40 minutes, but that window was enough to facilitate nearly 47,000 downloads. The ripple effect extended to Mercor, an AI data startup, resulting in the theft of 4TB of data, which included proprietary training methodologies belonging to Meta. A single vulnerability in an open-source dependency created a catastrophic blast radius across the AI ecosystem.

By March 30, the focus shifted to OpenAI Codex. The system was found to pass GitHub branch names directly into shell commands. An attacker could craft a branch name containing shell metacharacters such as semicolons or backticks, which the shell would interpret as commands rather than as plain text. The flaw let attackers expose victims' OAuth tokens, effectively stealing the digital keys used for user authentication. It was a fundamental failure of input sanitization: a simple string of text was allowed to execute arbitrary code on the server.
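The general defense against this class of bug can be sketched in a few lines of Python. This is a minimal illustration, not a reconstruction of the Codex code path: the allow-list pattern and the function names `is_safe_branch`, `checkout_command`, and `checkout` are assumptions made for the example. The idea is to validate the branch name against a strict allow-list and to pass it to `git` as a separate argument, so that no shell ever parses it.

```python
import re
import subprocess

# Hypothetical allow-list: ASCII letters, digits, and a few separators.
# Anything else (spaces, semicolons, backticks, "$", non-ASCII) is rejected.
SAFE_BRANCH = re.compile(r"[A-Za-z0-9._/-]+")


def is_safe_branch(name: str) -> bool:
    """Accept only names that fully match the allow-list and cannot be
    mistaken for a command-line option (no leading '-')."""
    return bool(SAFE_BRANCH.fullmatch(name)) and not name.startswith("-")


def checkout_command(branch: str) -> list[str]:
    """Build an argument vector for git. Because this is a list and not a
    shell string, metacharacters in `branch` would never be interpreted."""
    if not is_safe_branch(branch):
        raise ValueError(f"rejected branch name: {branch!r}")
    return ["git", "checkout", branch]


def checkout(branch: str, repo_dir: str) -> None:
    """Run the checkout without a shell (shell=False is the default)."""
    subprocess.run(checkout_command(branch), cwd=repo_dir, check=True)
```

Either measure alone would likely have blunted the attack described above. Using both means a mistake in one does not immediately become remote code execution.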

On March 31, Anthropic suffered a different kind of failure. Version 2.1.88 of Claude Code was published with a 59.8 MB source map file included by mistake. Source maps are translation guides that let developers map minified, machine-oriented code back to the original human-readable source. This single file exposed 513,000 lines of TypeScript source code across 1,906 files. The leak revealed the internal orchestration logic and system prompts, the very blueprints of how Claude operates, without requiring any authentication. This was not the result of a sophisticated hack but a simple human error: a missing entry in the `.npmignore` file.
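A packaging gate that catches this kind of mistake does not need to be elaborate. The sketch below assumes a release step that first produces the package tarball (for example with `npm pack`) and then refuses to publish if any source map is inside. The function names and the blocked-suffix list are hypothetical, chosen for illustration; nothing here describes Anthropic's actual release tooling.

```python
import tarfile
from pathlib import PurePosixPath

# Assumption for this sketch: source maps must never ship in a release.
BLOCKED_SUFFIXES = (".map",)


def find_blocked(paths):
    """Return the sorted subset of paths whose file name ends with a
    blocked suffix."""
    return sorted(
        p for p in paths if PurePosixPath(p).name.endswith(BLOCKED_SUFFIXES)
    )


def tarball_members(tgz_path):
    """List the regular files inside a gzip-compressed package tarball."""
    with tarfile.open(tgz_path, "r:gz") as tar:
        return [m.name for m in tar.getmembers() if m.isfile()]


def check_package(tgz_path):
    """Raise if the tarball contains any blocked file; call before publish."""
    blocked = find_blocked(tarball_members(tgz_path))
    if blocked:
        raise RuntimeError(f"refusing to publish, found: {blocked}")
```

Checking the built artifact rather than the ignore file is the point of the design: the gate inspects what would actually ship, so it still works when an ignore rule is missing.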

The cycle culminated between May 11 and May 14 with the emergence of a worm dubbed Mini Shai-Hulud. The malware rapidly contaminated more than 160 packages, including the popular UI library TanStack. It exploited configuration errors in GitHub Actions and used cache poisoning to hijack trusted deployment paths. It even created a fake identity, `claude <[email protected]>`, to slip past code review by mimicking a trusted entity. The worm eventually infected two devices belonging to OpenAI employees. OpenAI had hardened its CI/CD pipelines after previous attacks, but these two devices had not yet received the updated security configurations.

The Blind Spot of Model Red Teaming

The Mini Shai-Hulud worm demonstrated a terrifying efficiency, injecting malicious code into 42 npm packages in just six minutes. The attack succeeded not because of a flaw in an AI model but because of a gap in infrastructure automation. Specifically, the worm exploited the `pull_request_target` trigger in GitHub Actions to poison the build cache and extract OIDC (OpenID Connect) tokens from the runner's memory. This is the digital equivalent of guarding the front gate of a fortress with an army while leaving the delivery entrance unlocked. Most alarming, the malicious files were generated even though the build environment met SLSA Build Level 3, a rigorous software supply chain security standard. The implication is that the current trust model, which assumes that a legitimate repository and a valid token equal a secure build, is fundamentally flawed.
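One widely discussed risky pattern is a workflow that runs on `pull_request_target`, which executes with the base repository's privileges, while checking out code from the pull request's head. The Python sketch below is a rough text heuristic for spotting that combination in a workflow file. It is an assumption-laden illustration, not a description of how the worm operated: the function names are hypothetical, it does not parse YAML, and it will miss variants and raise false positives.

```python
import re

# GitHub expression contexts that refer to the (untrusted) PR head.
PR_HEAD_REF = re.compile(
    r"github\.event\.pull_request\.head\.(sha|ref)|github\.head_ref"
)


def strip_comments(text: str) -> str:
    """Naively drop everything after '#' on each line."""
    return "\n".join(line.split("#", 1)[0] for line in text.splitlines())


def is_risky_workflow(text: str) -> bool:
    """Flag workflows that combine the privileged trigger with a
    reference to the pull request's head commit or branch."""
    body = strip_comments(text)
    return "pull_request_target" in body and bool(PR_HEAD_REF.search(body))
```

A check like this belongs in review tooling as a prompt for human attention. A flagged workflow is not necessarily exploitable, and an unflagged one is not necessarily safe.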

Small infrastructure gaps consistently lead to massive exposures. In Anthropic's case, the absence of a single line in a `.npmignore` file exposed half a million lines of code. In the OpenAI Codex incident, Unicode characters let attackers create branch names that looked identical to `main` to the human eye but that the machine interpreted as malicious commands. These incidents show that a minor configuration oversight or a visual trick can be far more dangerous than any gap in a model's reasoning. The vulnerability is not in the intelligence but in the delivery.
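Lookalike names are easy for a machine to detect even when a person cannot see them. As a minimal sketch, the Python function below lists every character in a name that is not printable ASCII, along with its Unicode name, so a reviewer or a CI check can see what is really there. The function name `suspicious_chars` is hypothetical. The check is deliberately blunt: it flags all non-ASCII characters rather than attempting a full confusable-character analysis.

```python
import unicodedata


def suspicious_chars(name: str):
    """Return (character, Unicode name) pairs for every character that is
    not printable ASCII. An empty list means the name is plain ASCII."""
    return [
        (ch, unicodedata.name(ch, "UNNAMED"))
        for ch in name
        if not (ch.isascii() and ch.isprintable())
    ]


# A branch name that renders like "main" but contains a Cyrillic letter.
lookalike = "m\u0430in"
for ch, label in suspicious_chars(lookalike):
    print(f"U+{ord(ch):04X} {label}")
```

Projects that legitimately use non-ASCII branch names would need a more selective rule. For names that gate deployments, a strict ASCII policy is the simpler trade-off.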

This exposes a critical divide between model red teaming and deployment red teaming. Traditional model red teams focus on the AI's output, attempting to induce prohibited responses or identify biases, and their findings feed a system card. However, the recent breaches show that the real danger lies outside the model's boundary. A deployment red team must instead scrutinize the trust boundaries of CI runners and the integrity of packaging gates. No matter how ethical or safe a model is designed to be, if the conveyor belt delivering that model to the user is contaminated, the final product is effectively a poisoned apple.

OpenAI attempted to address the broader security landscape on May 10 by launching Daybreak, a cybersecurity initiative powered by GPT-5.5 and GPT-5.5-Cyber. The goal was to use state-of-the-art AI to automate the discovery and mitigation of vulnerabilities. The irony was immediate: the very next day, the Mini Shai-Hulud worm that had contaminated TanStack began the campaign that infected OpenAI's own internal devices. A flagship security AI was released just as the company's basic deployment hygiene failed. This serves as a stark reminder that model safety and infrastructure security are two entirely different disciplines.

In the wake of the infections, OpenAI was forced to revoke its macOS security certificates and require all desktop users to update by June 12, 2026. This action was a digital emergency reset, akin to changing every lock in a building because the master key was leaked. Because the security certificate, the digital seal of authenticity, was compromised, the company had to force a global update to restore trust. This process has nothing to do with the intelligence of GPT-5.5 and everything to do with basic infrastructure management.

For engineers and security practitioners, these events underscore the necessity of adhering to the NIST SSDF (Secure Software Development Framework). Specifically, the industry must prioritize PS.1.1, which calls for storing all forms of code under least-privilege access so that only authorized people and tools can reach it, and PS.2.1, which calls for making integrity verification information available so that recipients can confirm a release has not been tampered with. The lesson is clear: locking the door to the code repository and verifying the authenticity of the final package are just as important as the safety training of the model itself.
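On the consuming side, the simplest form of release verification is to compare a downloaded artifact's digest against one published through a separate trusted channel. The Python sketch below shows that check using SHA-256. The function names are hypothetical, and the example assumes the expected digest is already known and trustworthy. A digest match only shows the file is the one the publisher hashed; it says nothing about whether the build that produced it was clean.

```python
import hashlib
import hmac


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts do not load into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: str, expected_hex: str) -> bool:
    """Return True only if the file's SHA-256 matches the expected digest.
    Whitespace and letter case in the expected value are normalized."""
    return hmac.compare_digest(sha256_file(path), expected_hex.strip().lower())
```

An installer or deployment script would call `verify_artifact` before unpacking or executing anything, and abort on `False`.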

Investment in the intelligence of AI must be matched by an equal investment in the security of the pipelines that deploy it.