The anxiety in the enterprise AI space has shifted. It is no longer about whether a model can reason through a complex query or maintain a coherent conversation over a long window. Instead, the conversation among CISOs and platform engineers has turned toward the keys to the kingdom. For months, the primary barrier to deploying autonomous agents within production environments has not been a lack of intelligence, but a fundamental flaw in how credentials are handled. In the traditional deployment model, an agent carries its authentication tokens like a physical key ring, presenting them whenever it calls a tool or queries a database. This creates a catastrophic single point of failure: if the agent is compromised via prompt injection or suffers a logic collapse, the keys to the internal system are leaked along with it.

The Technical Specifications of Sandboxes and MCP Tunnels

Anthropic is attempting to solve this structural vulnerability by moving credential control away from the agent and pushing it to the network boundary. The company has introduced two mechanisms to achieve this: self-hosted sandboxes and MCP tunnels. Self-hosted sandboxes, currently available in public beta, allow enterprises to keep the actual execution of tools within their own infrastructure boundaries. This separates the agentic loop from the execution environment, both physically and logically. The cognitive functions of the agent (orchestration, context management, and error recovery) continue to run on Anthropic infrastructure, while the computing resources where the tools actually operate are controlled entirely by the customer.
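To make the division of labor concrete, here is a minimal sketch of the two roles, assuming a simple in-process call in place of any real network transport. The names `orchestration_loop`, `CustomerSandbox`, and the `word_count` tool are hypothetical illustrations, not part of Anthropic's product or API.

```python
from typing import Callable, Dict, List

ToolCall = Dict[str, object]
ToolResult = Dict[str, object]


def orchestration_loop(
    plan: List[ToolCall],
    execute: Callable[[ToolCall], ToolResult],
    max_retries: int = 1,
) -> List[ToolResult]:
    """Provider-side role: sequence calls, keep context, retry on failure.

    It never runs a tool itself; it only hands calls to `execute`.
    """
    context: List[ToolResult] = []
    for call in plan:
        result: ToolResult = {"ok": False, "error": "not attempted"}
        for _attempt in range(max_retries + 1):
            result = execute(call)
            if result.get("ok"):
                break  # success, no further retries needed
        context.append(result)
    return context


class CustomerSandbox:
    """Customer-side role: the only place a tool actually runs."""

    def __init__(self) -> None:
        self.executed: List[str] = []  # audit trail kept by the customer

    def run(self, call: ToolCall) -> ToolResult:
        self.executed.append(str(call["tool"]))
        if call["tool"] == "word_count":  # hypothetical example tool
            return {"ok": True, "value": len(str(call["text"]).split())}
        return {"ok": False, "error": "unknown tool"}


sandbox = CustomerSandbox()
results = orchestration_loop(
    [{"tool": "word_count", "text": "a b c"}, {"tool": "nope"}],
    sandbox.run,
)
```

In this sketch the loop only sees results, while the sandbox keeps its own record of what actually ran, including the retried call.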

Complementing this is the introduction of MCP tunnels, which are currently in a research preview phase. These tunnels use the Model Context Protocol (MCP) to create a secure path to private MCP servers without ever exposing credentials within the agent's context. The architecture employs an outbound-only gateway within the organizational network. Because the gateway only initiates outgoing connections and all inbound paths stay blocked, nothing inside the network has to accept connections from outside. Authentication is handled at the network edge, so credentials never pass through the agent itself. Even if an agent's control loop is corrupted, the attacker cannot extract the master keys because the agent never possessed them in the first place.
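The idea of attaching credentials at the edge can be sketched in a few lines. This is an illustration under stated assumptions, not Anthropic's implementation or the MCP wire format: `EDGE_SECRETS`, `ToolRequest`, `EdgeGateway`, and `private_mcp_server` are all hypothetical names, and an in-memory queue stands in for the outbound connection.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, Optional

# Hypothetical secret store that exists only inside the customer network.
EDGE_SECRETS: Dict[str, str] = {"billing-db": "s3cr3t-token"}


@dataclass
class ToolRequest:
    """What the agent loop emits: a target and arguments, never a credential."""
    server: str
    tool: str
    args: dict


def private_mcp_server(server: str, tool: str, args: dict, token: str) -> dict:
    """Stand-in for a private server that rejects unauthenticated calls."""
    if token != EDGE_SECRETS.get(server):
        return {"ok": False, "error": "unauthorized"}
    return {"ok": True, "result": f"{tool} ran with args {sorted(args)}"}


class EdgeGateway:
    """Outbound-only gateway: it pulls pending requests; nothing dials in."""

    def __init__(self, pending: Deque[ToolRequest]) -> None:
        self.pending = pending

    def poll_once(self) -> Optional[dict]:
        if not self.pending:
            return None
        request = self.pending.popleft()
        # The credential is looked up and attached here, at the edge.
        token = EDGE_SECRETS.get(request.server, "")
        return private_mcp_server(request.server, request.tool, request.args, token)


queue: Deque[ToolRequest] = deque(
    [ToolRequest("billing-db", "lookup_invoice", {"id": 42})]
)
gateway = EdgeGateway(queue)
response = gateway.poll_once()
```

Neither the request the agent produced nor the response it receives contains the token, which is the property the tunnel design is meant to guarantee.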

This design fundamentally alters the workflow of agent deployment. The sandbox determines where the tool is executed and which resources it can touch, while the MCP tunnel defines the secure route to the internal system. For development teams, granting permissions is no longer a matter of assigning roles to the agent, but of defining which traffic and calls are allowed at the network perimeter. Given that the sandboxes are in public beta while the tunnels are still a research preview, a natural rollout order is to migrate tool execution to self-hosted sandboxes first, test the security of that boundary, and then gradually expand connectivity through MCP tunnels.
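What "defining allowed traffic and calls" might look like in practice is easier to see in code. The following is a sketch only, assuming a policy made of permitted (server, tool) routes and permitted sandbox directories; `PerimeterPolicy` and its fields are hypothetical and do not reflect any real configuration format.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class PerimeterPolicy:
    # (server, tool) pairs the tunnel is allowed to carry.
    allowed_routes: FrozenSet[Tuple[str, str]]
    # Directories the sandbox is allowed to touch.
    sandbox_roots: Tuple[str, ...]

    def allows_call(self, server: str, tool: str) -> bool:
        """Route check: deny anything not explicitly listed."""
        return (server, tool) in self.allowed_routes

    def allows_path(self, path: str) -> bool:
        """Resource check: the path must sit under an allowed root."""
        parts = PurePosixPath(path).parts
        if ".." in parts:  # refuse traversal instead of trying to resolve it
            return False
        for root in self.sandbox_roots:
            root_parts = PurePosixPath(root).parts
            if parts[: len(root_parts)] == root_parts:
                return True
        return False


policy = PerimeterPolicy(
    allowed_routes=frozenset({("billing-db", "lookup_invoice")}),
    sandbox_roots=("/srv/agent/work",),
)
```

The policy says nothing about who the agent is. It describes only what may cross the boundary, which is why a security team can review it the way it would review a firewall rule set.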

The Architectural Pivot Toward Loop-Execution Separation

To understand why this matters, one must look at the diverging paths between Anthropic and its primary competitors. In April, OpenAI added local execution capabilities to its Agents SDK, focusing on a sandbox approach that brings the execution environment closer to the user. Anthropic has taken a different route by decoupling the orchestration loop from the execution environment. The contrast is the difference between moving a safe into a different room and removing the key from the lock entirely. The OpenAI approach focuses on where the code runs, while the Anthropic approach focuses on who holds the authority.

This separation creates a brain-and-muscle dynamic. The brain, which handles the high-level reasoning and decision-making, resides in the cloud. The muscle, which performs the actual API calls and database writes, resides within the corporate firewall. By splitting these two functions across different network zones, Anthropic is redefining the threat model for enterprise AI. In a standard sandbox, a compromised agent might still be able to abuse the tokens it holds to move laterally through a system. In a loop-execution separated architecture, the agent is essentially a blind operator sending requests to a gatekeeper who holds the keys. The gatekeeper validates the request against network policies before executing the action.

For industries with extreme regulatory requirements, such as finance or healthcare, this shift lowers the psychological and technical barrier to adoption. The risk is no longer tied to the unpredictability of a large language model's behavior, but to the predictability of network security policies. Engineers are now treating agent deployment as an infrastructure problem rather than a prompt engineering problem. According to docs.anthropic.com, this separation allows for a more effective mapping of agent workflows to existing security protocols, ensuring that the AI operates within a predefined safety envelope that the company's security team can audit using traditional tools.

This transition marks a move away from the era of trusting the agent with the key, replacing it with a regime where the network itself is the ultimate gatekeeper.