Amazon Bedrock AgentCore: Scaling AI SaaS with IAM-Level Isolation

SaaS developers have long faced a trade-off between operational efficiency and strong isolation. On one side lies the silo model, where every customer gets dedicated infrastructure, ensuring total isolation but driving costs to unsustainable levels. On the other side is the shared model, which maximizes resource utilization but introduces the risk that a single coding error, such as a missing tenant filter in a WHERE clause, leaks one client's sensitive data to another. For AI agents handling healthcare or financial records, this tension is not just a technical hurdle but a regulatory liability. The industry has been searching for a middle ground where infrastructure is shared but each tenant's data remains strictly isolated.

The Three-Tier Architecture of the Pool Model

Amazon Bedrock AgentCore addresses this dilemma through a pool model multitenancy architecture organized into a strict three-level hierarchy: Tier, Tenant, and User. In this model, tenants do not receive dedicated hardware; instead, they share a common pool of computing resources. Isolation is achieved not through physical separation but through scoped identifiers, access policies, and data partitioning. This approach allows operators to maximize resource density while keeping isolation guarantees close to those of a siloed environment.

To visualize this in a real-world scenario, consider a healthcare AI agent serving multiple medical groups. The top level, the Tier, defines the service quality and feature set based on the subscription plan, such as Basic or Premium. Below the Tier is the Tenant, representing an individual clinic or hospital. At the base of the hierarchy is the User, the specific doctor or administrator within that clinic. This hierarchy serves as the primary isolation boundary for every interaction, from knowledge base document retrieval and session memory to model access permissions and cost tracking. The full implementation details and configuration for this pattern are available in the sample-agentcore-and-multitenancy-blog repository.

Compute isolation is handled at the runtime level. When an agent session begins, Amazon Bedrock AgentCore runs it in its own dedicated microVM. Code from different tenants therefore never shares a memory space, which removes the risk of one tenant's data leaking to another through shared memory. The system also hosts separate agent instances per service tier, so a heavy workload from a Basic tier tenant cannot degrade the performance of a Premium tier tenant.

Identity verification relies on a JSON Web Token (JWT) model powered by Cognito ID tokens. These tokens are validated at two critical boundaries: the runtime and the gateway. The JWT contains custom claims, such as `tenant_id` and `tier`, which travel with the request to provide essential metadata. During the agent deployment phase, the JWT authorizer is configured to validate the token before any agent code is ever executed:

    {
      "authorizer": {
        "type": "JWT",
        "config": {
          "issuer": "https://cognito-idp.region.amazonaws.com/user-pool-id",
          "audience": ["client-id"]
        }
      }
    }
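To make the authorizer's role concrete, the claim checks can be pictured as the following Python sketch. It is an illustration of the logic only, not AgentCore's implementation: the function name `check_claims` is hypothetical, and it assumes the token's signature has already been verified and its payload decoded into a dictionary.

```python
from typing import Any, Mapping

# Custom claims the article says travel with every request.
REQUIRED_CUSTOM_CLAIMS = ("tenant_id", "tier")


def check_claims(
    claims: Mapping[str, Any], issuer: str, audiences: list[str]
) -> dict[str, str]:
    """Check issuer, audience, and tenant claims on an already-verified token.

    Hypothetical helper: signature verification is assumed to have
    happened before this function is called.
    """
    if claims.get("iss") != issuer:
        raise ValueError("unexpected issuer")

    # The "aud" claim may be a single string or a list of strings.
    aud = claims.get("aud")
    token_audiences = aud if isinstance(aud, list) else [aud]
    if not any(a in audiences for a in token_audiences):
        raise ValueError("unexpected audience")

    missing = [c for c in REQUIRED_CUSTOM_CLAIMS if not claims.get(c)]
    if missing:
        raise ValueError(f"missing claims: {missing}")

    # Return only the tenant context that downstream code needs.
    return {c: str(claims[c]) for c in REQUIRED_CUSTOM_CLAIMS}
```

A token that lacks `tenant_id` or `tier` is rejected here, so no agent code ever runs without a tenant context.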

Once validated, the AgentCore Gateway transforms this identity into a concrete execution context. It generates specific headers (`X-Tier`, `X-Clinic-ID`, and `X-S3-Prefix`) and propagates them to downstream systems. When the agent calls the gateway using the original JWT as a Bearer token, the gateway verifies it and passes the tenant headers to the target Lambda function via the `metadataConfiguration`. Because the target Lambda only trusts headers that have passed through the CUSTOM_JWT authorizer, developers no longer need to implement redundant authentication logic within every individual tool.
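The mapping from claims to headers can be sketched in a few lines of Python. The header names come from the text above; the function name `tenant_headers` and the S3 prefix layout (`tenants/<clinic>/`) are assumptions made for illustration, since the article does not specify them.

```python
def tenant_headers(claims: dict[str, str]) -> dict[str, str]:
    """Derive the tenant context headers from validated JWT claims.

    Hypothetical sketch: the prefix layout below is an assumption.
    """
    tier = claims["tier"]
    clinic_id = claims["tenant_id"]
    return {
        "X-Tier": tier,
        "X-Clinic-ID": clinic_id,
        # Assumed layout: one S3 prefix per clinic.
        "X-S3-Prefix": f"tenants/{clinic_id}/",
    }
```

Because the headers are computed from verified claims rather than from anything the caller supplies directly, a tool behind the gateway can treat them as trusted input.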

Shifting Security from Application Code to Cloud Fabric

Most SaaS applications rely on application-level filtering, where the developer is responsible for adding tenant IDs to every database query. This is a fragile strategy; a single forgotten filter in a complex join can lead to a catastrophic data breach. Amazon Bedrock AgentCore moves this responsibility from the developer to the infrastructure. It uses a hierarchical namespace structure and a composite `actor_id` within AgentCore Memory to partition conversation histories by tenant and user.
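One way to picture the composite `actor_id` and the hierarchical namespace is the sketch below. The separator, the path layout, and both function names are assumptions; the article does not give the exact format AgentCore Memory expects.

```python
import re

# Assumed rule: identifier parts may contain only these characters, so the
# separator can never appear inside a part and two different
# (tenant, user) pairs can never collide on the same actor_id.
_SAFE_PART = re.compile(r"[A-Za-z0-9_-]+")


def _check(part: str) -> str:
    if not _SAFE_PART.fullmatch(part):
        raise ValueError(f"invalid identifier part: {part!r}")
    return part


def composite_actor_id(tenant_id: str, user_id: str) -> str:
    """Join tenant and user into one actor identifier (assumed format)."""
    return f"{_check(tenant_id)}:{_check(user_id)}"


def memory_namespace(tier: str, tenant_id: str, user_id: str) -> str:
    """Build a Tier/Tenant/User namespace path (assumed layout)."""
    return "/" + "/".join(_check(p) for p in (tier, tenant_id, user_id))
```

Validating each part before joining matters: without it, a user ID containing the separator could produce an identifier that belongs to a different tenant.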

To enforce this at the infrastructure level, the system employs the Token Vending Machine (TVM) pattern combined with Attribute-Based Access Control (ABAC). Instead of using a broad, static role, the agent assumes a TVM role at runtime and tags the session with three attributes: Tier, ClinicId, and UserId. The temporary, short-lived credentials it receives carry those tags, so every downstream IAM policy is evaluated against them. In this paradigm, the user's attributes themselves become the key to the data.
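As a rough sketch of the TVM step, the function below builds the parameters for an STS `AssumeRole` call with the three session tags. `RoleArn`, `RoleSessionName`, `DurationSeconds`, and `Tags` are real `AssumeRole` parameters; the function name, the session-name format, and the 15-minute default are assumptions.

```python
def build_assume_role_params(
    role_arn: str,
    tier: str,
    clinic_id: str,
    user_id: str,
    duration_seconds: int = 900,  # assumed default; 900 is the STS minimum
) -> dict:
    """Build keyword arguments for STS AssumeRole with tenant session tags."""
    return {
        "RoleArn": role_arn,
        # Session names are limited to 64 characters.
        "RoleSessionName": f"{clinic_id}-{user_id}"[:64],
        "DurationSeconds": duration_seconds,
        "Tags": [
            {"Key": "Tier", "Value": tier},
            {"Key": "ClinicId", "Value": clinic_id},
            {"Key": "UserId", "Value": user_id},
        ],
    }


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   params = build_assume_role_params(role_arn, "Basic", "clinic-a", "dr_lee")
#   creds = boto3.client("sts").assume_role(**params)["Credentials"]
```

Keeping the parameter construction separate from the network call makes the tagging logic easy to unit test without touching AWS.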

This is most evident in the integration with DynamoDB. By using the `dynamodb:LeadingKeys` condition in IAM policies, the system ensures that a request is only permitted if the requester's session tag matches the partition key of the data being accessed. Even if a developer writes a malformed query that attempts to access another clinic's data, the AWS IAM layer will block the request before it ever reaches the database engine. The trust policy for the TVM role is strictly limited to the Agent Execution Role and requires all three session tags to be present before issuing credentials:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::ACCOUNT_ID:role/AgentExecutionRole"
          },
          "Action": ["sts:AssumeRole", "sts:TagSession"],
          "Condition": {
            "StringLike": {
              "aws:RequestTag/Tier": "*",
              "aws:RequestTag/ClinicId": "*",
              "aws:RequestTag/UserId": "*"
            }
          }
        }
      ]
    }
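The `dynamodb:LeadingKeys` condition described earlier could be expressed as a permissions policy like the one this sketch builds. It assumes the table's partition key holds the clinic ID and that the role only needs read access; the function name and the chosen actions are illustrative, not taken from the sample repository.

```python
import json


def tenant_table_policy(table_arn: str) -> dict:
    """Build an IAM policy that limits reads to the caller's own clinic.

    Assumption: the table's partition key value is the clinic ID, so it
    can be compared with the ClinicId session tag.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query"],
                "Resource": table_arn,
                "Condition": {
                    # Every partition key in the request must equal the
                    # ClinicId tag on the caller's session.
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": ["${aws:PrincipalTag/ClinicId}"]
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    arn = "arn:aws:dynamodb:REGION:ACCOUNT_ID:table/PatientRecords"
    print(json.dumps(tenant_table_policy(arn), indent=2))
```

The policy variable is resolved by IAM at request time, so one policy document serves every tenant.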

Beyond data isolation, the system manages service-tier differentiation through the Model Context Protocol (MCP) and the Cedar policy engine. The AgentCore Gateway uses MCP to transform static Lambda functions into dynamic tools, removing the need for custom orchestration logic to handle API parsing or tenant context propagation. For example, tools like `get_patient_records` and `patient_context` can be deployed and managed as standardized MCP tools.

Access to these tools is governed by the AgentCore Policy engine, which uses the Cedar policy language. This engine operates in ENFORCE mode, validating every request in real time. This allows operators to change feature access for different tiers by simply updating a policy file rather than rewriting code. In a practical application, a Basic tier user might be restricted from using the `patient_context` tool outside of business hours. When the agent calls the `current_time` function, the Cedar engine evaluates the `request_hour` field and immediately rejects any request made outside the 08:00 to 18:00 window.
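The business-hours rule can be emulated in plain Python to show the decision it encodes. This is not Cedar syntax and not the AgentCore Policy API; it is a hypothetical stand-in that assumes the window includes 08:00 and excludes 18:00, and that `request_hour` is an integer from 0 to 23.

```python
# Assumed window: 08:00 inclusive to 18:00 exclusive.
BUSINESS_HOURS = range(8, 18)


def is_tool_call_allowed(tier: str, tool: str, request_hour: int) -> bool:
    """Emulate the tier-based time restriction described above."""
    if not 0 <= request_hour <= 23:
        raise ValueError("request_hour must be between 0 and 23")
    # Only Basic tier calls to patient_context are time-restricted.
    if tier == "Basic" and tool == "patient_context":
        return request_hour in BUSINESS_HOURS
    return True
```

In the real system the equivalent rule lives in a policy file, which is why changing the window or the affected tier requires no code deployment.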

For B2B AI SaaS practitioners, this architecture represents a fundamental shift in how security is conceptualized. By moving the burden of isolation from the application layer to the IAM layer, the risk of human error is drastically reduced. The combination of the pool model for cost efficiency, ABAC for strict isolation, and MCP for rapid tool integration allows teams to scale their AI offerings without compromising security or adding operational overhead.

The era of security-by-coding is ending, replaced by a world where data isolation is an immutable property of the infrastructure itself.
