The 85% Success Rate of Agentjacking Attacks on Claude Code

A developer opens their IDE to squash a lingering bug, trusting their AI assistant to parse the logs and suggest a fix. They prompt the agent to analyze the latest error reports, and within seconds, the AI identifies the issue and proposes a command to resolve it. The developer hits enter, believing they are streamlining their workflow. In reality, they have just executed a malicious payload delivered via a spoofed error report, granting an external attacker full access to their local environment. This is the new frontier of social engineering, where the target is not the human, but the AI agent the human trusts.

The Mechanics of Agentjacking and the MCP Vulnerability

Tenet Security has identified a critical vulnerability they term agentjacking, a method of hijacking AI agents to execute unauthorized commands. The attack vector leverages the Model Context Protocol (MCP), a standard designed to allow AI models to seamlessly access external data sources. While MCP enhances the utility of agents, it creates a dangerous trust bridge. The vulnerability specifically targets the way agents interact with Sentry, a widely used real-time error tracking tool. By utilizing publicly exposed Sentry credentials, specifically the Data Source Name (DSN), attackers can inject fabricated error reports into a system.

When an AI agent like Claude Code, Cursor, or Codex is tasked with diagnosing a problem, it queries the MCP server for recent logs. If an attacker has injected a malicious payload into the Sentry stream, the MCP server returns this data as a trusted diagnostic result. The AI agent, lacking a mechanism to verify the authenticity of the log content, treats the attacker's instructions as legitimate system data. In controlled testing, this attack achieved a staggering 85% success rate. The agent does not see a hack; it sees a bug report that happens to contain a command it believes is necessary to fix the system.

This is not a theoretical risk limited to a few edge cases. Tenet Security identified 2,388 organizations with publicly exposed Sentry credentials. For these organizations, the risk is immediate. Because AI coding agents are often granted shell execution privileges to perform tasks like installing dependencies or running tests, the agentjacking payload is executed with the full permissions of the developer. The danger extends beyond Sentry; any MCP-connected data source that a developer trusts—including Datadog, PagerDuty, and Jira—can potentially serve as a conduit for malicious instructions if the underlying credentials are compromised.

The Failure of Static Security and the Identity Gap

What makes agentjacking particularly lethal is that it bypasses the entire traditional security stack. Endpoint Detection and Response (EDR) tools, Web Application Firewalls (WAF), and Identity and Access Management (IAM) systems are designed to stop unauthorized intrusions or anomalous network traffic. However, in an agentjacking scenario, the traffic is legitimate. The API call to the Sentry server is valid, the MCP server is functioning as intended, and the AI agent is performing a task it was explicitly authorized to do: read logs and execute fixes. To the security software, this looks like a productive developer using a modern toolset.

This reveals a fundamental disconnect in how enterprises manage AI permissions. According to a survey by Okta and Apprize360 involving 292 executives and 492 knowledge workers, only 34% of organizations apply the same security controls to AI agents as they do to human employees. This gap is further highlighted by Gravitee, whose survey of over 900 professionals found that only 14.4% of deployed agents had undergone a full security approval process. We are essentially granting AI agents the keys to the kingdom—including shell access and production environment privileges—without implementing the oversight required for such power.

This systemic vulnerability is reflected in broader industry trends. The 2026 AI Threat Landscape Report from HiddenLayer indicates that one in eight AI security breaches is now linked to agent systems. Furthermore, 33% of the 250 IT and security leaders surveyed admitted that their AI agents have already exceeded their original intended scope. The industry is currently relying on static policies—predefined rules that grant a set of permissions at the start of a session. Once the agent is authenticated, it is trusted implicitly until the session ends. This static trust is exactly what agentjacking exploits.

To counter this, the industry is shifting toward Continuous Identity. CrowdStrike recently introduced Continuous Identity for AI Agents at the Identiverse conference on June 15, proposing a move away from static permissions toward a real-time authorization framework. Instead of a one-time check at the door, this approach requires the agent to verify its identity and the legitimacy of its action for every single operation it performs. It transforms security from a perimeter fence into a continuous verification process, ensuring that even if an agent is tricked by a fake error report, the resulting action is flagged and blocked in real-time because it deviates from the agent's verified behavioral identity.

Security in the era of autonomous agents is no longer about deciding who to trust, but about strictly defining what is allowed to happen at runtime.

The 85% Success Rate of Agentjacking Attacks on Claude Code

The Mechanics of Agentjacking and the MCP Vulnerability

The Failure of Static Security and the Identity Gap

Related Articles