Why Anthropic Skill Scanners Miss the .test.ts Security Gap

A developer browses ClawHub, the AI Skill marketplace, and finds a promising tool to extend their AI agent's capabilities. They run the installation command, and the security scanner immediately begins its work. It parses the markdown instructions, scans for prompt injection patterns, and checks for known malicious strings. Every indicator turns green. The tool is marked as safe. However, the moment the installation completes, a hidden file named `.test.ts` tucked away in a subfolder triggers. Without a single warning, the system's SSH keys and environment variables are silently packaged and transmitted to a remote server.

The Hidden Execution Path and the 13.4% Risk

This vulnerability exists because of a fundamental blind spot in how AI skills are installed and verified. Jeevan Jutla, a researcher at Gecko Security, demonstrated that the risk originates during the process of adding a skill via `npx`. When a developer installs a skill, the installer copies the entire directory of the skill repository into the local storage. While the security scanners focus on the primary logic and instructions, they often overlook the test suite. Files ending in `.test.ts` are automatically discovered by common JavaScript testing frameworks like Jest or Vitest. These tools use recursive search patterns to find any test file within the project structure, including those buried inside the `.agents` folder.

The attack is particularly effective because the malicious payload does not reside in the test logic itself. Instead, it is placed within the `beforeAll` block. This block executes immediately when the test runner initializes, long before any actual validation or assertion takes place. By the time a developer realizes a test has failed or passed, the payload has already executed its primary objective.

To quantify this threat, the SkillScan research team conducted a massive analysis on January 15, examining 31,132 unique skills collected from two major marketplaces. The results revealed a systemic issue: 26.1% of all skills contained one of 14 identified vulnerability patterns. Specifically, vulnerabilities related to data exfiltration accounted for 13.3% of the samples, while privilege escalation vulnerabilities were found in 11.8%. The research also highlighted a correlation between the type of skill and its risk profile; skills that included executable scripts were 2.12 times more likely to contain vulnerabilities than those consisting solely of instructions.

Further evidence emerged on February 5 through the ToxicSkills report published by Snyk. After a comprehensive audit of 3,984 skills, Snyk found that 13.4% of the skills suffered from severe security flaws. Through a combination of automated scanning and manual expert review, the team identified 76 distinct malicious payloads. Alarmingly, eight of these malicious skills remained available for download on ClawHub at the time the report was published.

Industry giants have attempted to address this, but the gap remains. On April 21, Cisco released an AI Agent Security Scanner integrated into VS Code, Cursor, and Windsurf. While this tool is highly effective at detecting threats within the agent interaction layer, it continues the industry trend of excluding bundled test files from its scan perimeter, leaving the `.test.ts` vector wide open.

The Divergence Between Agent and Developer Execution Surfaces

The core of the problem lies in a misunderstanding of the attack surface. Most existing security tools, including Snyk Agent Scan and VirusTotal Code Insight, are designed to protect the Execution Surface—the point where the AI agent actually operates. These tools are proficient at catching prompt injections or unauthorized shell command executions triggered by the model. However, the `.test.ts` attack does not target the AI agent; it targets the developer's toolchain.

This represents a shift in the trust model. An attacker provides a pristine `SKILL.md` file that passes every security check, creating a false sense of security. While the scanner is preoccupied with the safety of the agent's instructions, the test runner is executing a file like `tests/reviewer.test.ts`. This script operates with the full permissions of the local user, allowing it to access `process.env` for environment variables, read private keys from the `~/.ssh/` folder, or steal cloud credentials from `~/.aws/credentials`.

This risk is amplified by the way modern development teams collaborate. The `.agents/skills/` directory is designed to be committed to Git repositories so that team members can share a consistent set of agent capabilities. However, the default `.gitignore` templates provided by GitHub do not include the `.agents/` folder. If a single developer unknowingly adds a malicious skill and commits it to the repository, every other developer who clones that repo—and every CI/CD pipeline that runs the tests—becomes a victim of the payload.

This pattern is not exclusive to the JavaScript ecosystem. Similar exposure exists in Python environments. The `pytest` framework automatically executes `conftest.py` during the test collection phase. To prevent this specific vector in Python projects, developers must explicitly add the `.agents` directory to the `testpaths` exclusion list within the `pyproject.toml` configuration file.

As AI agents move from simple chat interfaces to integrated tools with local system access, the security boundary has shifted. It is no longer enough to sanitize the model's output; the entire local development environment is now part of the attack surface.

Why Anthropic Skill Scanners Miss the .test.ts Security Gap

The Hidden Execution Path and the 13.4% Risk

The Divergence Between Agent and Developer Execution Surfaces

Related Articles