Medical administrative staff still spend thousands of hours every year trapped in a cycle of digital transcription. The scene is familiar across healthcare providers: a staff member opens a scanned PDF of a CMS-1500 claim form on one screen and manually types those details into a billing system on another. Despite the industry's push toward digital transformation, the last mile of data entry remains a fragile bridge of human copy-pasting. This manual dependency creates a dangerous bottleneck where a single typo in a patient ID or a misread digit in a billing code triggers a cascade of claim denials and payment delays, forcing expensive human audits to correct errors that should never have existed.
The Architecture of the Bedrock Medical Pipeline
AWS has addressed this systemic inefficiency by deploying an end-to-end automation pipeline that integrates Amazon Bedrock Data Automation and Amazon Bedrock AgentCore. The workflow begins at the ingestion layer, where healthcare providers upload PDF-based CMS-1500 claim forms into an Amazon S3 bucket. This upload serves as an event trigger for AWS Lambda, which orchestrates the subsequent stages of the pipeline. The heavy lifting of data extraction is handled by Amazon Bedrock Data Automation, which employs a combination of Optical Character Recognition (OCR), machine learning models, and generative AI to parse complex text and tabular data from the documents.
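The ingestion step can be sketched as a small Lambda-style handler that reads the uploaded object's location out of the S3 event. This is a minimal sketch, not the pipeline's actual code: the function names are hypothetical, and the step that would start a Bedrock Data Automation job is left as a comment rather than guessed at.

```python
from urllib.parse import unquote_plus


def claim_pdfs_from_s3_event(event: dict) -> list:
    """Return (bucket, key) pairs for PDF objects named in an S3 event."""
    uploads = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in event notifications, so decode them.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.lower().endswith(".pdf"):
            uploads.append((bucket, key))
    return uploads


def handler(event, context):
    # Hypothetical entry point. A real handler would start a Bedrock Data
    # Automation job for each upload here; this sketch only lists the targets.
    return {"claims": claim_pdfs_from_s3_event(event)}
```

Filtering on the `.pdf` suffix is an assumption made for illustration; the same filtering could equally be configured on the S3 event notification itself.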
To keep the output predictable and structured, the system uses Blueprints: configuration files that define which data points to extract and how to format them. By applying predefined templates or custom configurations, the pipeline turns the visual layout of a PDF into a structured JSON object. This JSON output is more than raw text; it includes a confidence score for every extracted field and bounding box coordinates that show where on the original page each value was found. Together these give reviewers a traceable basis for judging the accuracy of each extraction.
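One way downstream code might use the confidence scores is to split fields into accepted values and values that need review. The sketch below assumes a simplified field shape (`value`, `confidence`, `bounding_box`) that is not the actual Bedrock Data Automation output schema; the threshold and the sample scores are likewise invented for illustration.

```python
REVIEW_THRESHOLD = 0.80  # assumed cut-off, not a value taken from the pipeline


def split_by_confidence(fields: dict, threshold: float = REVIEW_THRESHOLD):
    """Separate extracted fields into accepted values and ones needing review."""
    accepted, needs_review = {}, {}
    for name, field in fields.items():
        if field["confidence"] >= threshold:
            accepted[name] = field["value"]
        else:
            # Keep the whole field, including its bounding box, so a reviewer
            # can locate the text on the original page.
            needs_review[name] = field
    return accepted, needs_review


# Illustrative input only: the scores and coordinates are made up.
sample_fields = {
    "patient_name": {
        "value": "John Doe",
        "confidence": 0.98,
        "bounding_box": {"page": 1, "left": 0.12, "top": 0.18, "width": 0.30, "height": 0.02},
    },
    "claim_id": {
        "value": "11-2234-1019O",
        "confidence": 0.61,
        "bounding_box": {"page": 1, "left": 0.62, "top": 0.10, "width": 0.25, "height": 0.02},
    },
}
accepted, needs_review = split_by_confidence(sample_fields)
```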
Once the data is extracted, it moves to AgentCore, which hosts a specialized agent built with Strands. This agent converts the raw JSON data into FHIR (Fast Healthcare Interoperability Resources) resources, FHIR being the widely adopted standard for healthcare data exchange. To interact with the backend storage, the agent uses two tools: `create_fhir_claim` and `search_fhir_resources`. These tools let the agent communicate directly with AWS HealthLake, so the billing data is not only stored but also interoperable with other healthcare systems. The agent follows a defined search logic: it first checks its previous tool call history to identify the insured party, and if no match is found, it performs up to two additional searches using alternative parameters from the JSON payload, prioritizing the attributes with the highest confidence scores.
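The search policy described above can be written out as plain Python to make the ordering explicit. This is a sketch under stated assumptions: in the real pipeline the agent applies this logic through its tools, the `search` callable merely stands in for `search_fhir_resources` (whose actual signature is not given here), and the history and candidate shapes and all confidence values are hypothetical.

```python
MAX_EXTRA_SEARCHES = 2  # "up to two additional searches"


def find_insured(history, candidates, search):
    """Look for the insured party: reuse history first, then retry by confidence."""
    # 1. Reuse a match from an earlier tool call if one exists.
    for call in history:
        if call.get("result"):
            return call["result"]
    # 2. Otherwise try untried attributes, highest confidence first.
    tried = {call.get("param") for call in history}
    ranked = sorted(
        (name for name in candidates if name not in tried),
        key=lambda name: candidates[name]["confidence"],
        reverse=True,
    )
    for name in ranked[:MAX_EXTRA_SEARCHES]:
        result = search(name, candidates[name]["value"])
        if result:
            return result
    return None


# Usage with a stubbed search that only matches on name.
calls = []


def fake_search(param, value):
    calls.append((param, value))
    return {"resourceType": "Patient", "id": "example"} if param == "name" else None


history = [{"param": "identifier", "result": None}]  # earlier lookup found nothing
candidates = {
    "identifier": {"value": "11-2234-1019O", "confidence": 0.61},
    "name": {"value": "John Doe", "confidence": 0.98},
    "birthdate": {"value": "1960-10-10", "confidence": 0.95},
}
match = find_insured(history, candidates, fake_search)
```

Capping the retries keeps the number of HealthLake queries per document bounded even when nothing matches.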
Shifting from Runtime Inference to Design-Time Determinism
Many AI agent frameworks expose tools through the Model Context Protocol (MCP) and leave the model to decide which tools to call at runtime. AWS took a different architectural path to improve reliability and cost-efficiency. Letting the model infer the sequence of tool calls during execution can lead to unpredictable behavior and higher token costs, so AWS instead used Kiro, an agentic IDE that converts natural language specifications into deterministic API call code during the design phase.
By using Kiro to generate the Lambda code that calls the Bedrock Data Automation API, along with the agent tools, AWS shifted much of the intelligence from runtime to design time. With less exploratory prompting during execution, Bedrock calls cost less and the development cycle shortens. The system also behaves more predictably, reducing the "hallucination risk" associated with agents that work out their workflow on the fly. This deterministic approach is reinforced by a supervisory layer in which AWS Lambda acts as the final arbiter. After the agent reports that it has created a FHIR resource in HealthLake, Lambda verifies that the operation succeeded. Any document that fails this verification is routed to a Dead Letter Queue (DLQ), so failed claims are retained for follow-up rather than silently dropped.
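The supervisory check can be sketched as a function that trusts the agent's report only after confirming the resource exists, and dead-letters the document otherwise. The names here are hypothetical, and the two callables stand in for a HealthLake lookup and a queue send so the sketch runs without AWS access.

```python
def supervise(document_key, agent_report, resource_exists, send_to_dlq):
    """Verify the agent's claimed result; dead-letter the document on failure."""
    claim_id = agent_report.get("claim_id")
    if claim_id and resource_exists(claim_id):
        return "verified"
    # Keep enough context in the DLQ message to retry or investigate later.
    send_to_dlq({"document": document_key, "agent_report": agent_report})
    return "dead-lettered"


# Usage with in-memory stand-ins for the datastore and the queue.
datastore = {"claim-001"}
dlq = []
ok = supervise("input/a.pdf", {"claim_id": "claim-001"}, datastore.__contains__, dlq.append)
bad = supervise("input/b.pdf", {"claim_id": "claim-999"}, datastore.__contains__, dlq.append)
```

The point of the design is that the agent's own success message is never the last word: an independent read decides whether the document is done.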
This robustness shows in how the system handles real-world data discrepancies. In one test case, a claim form contained a typo: a claim ID was written as 11-2234-1019O (ending in the letter O) while the database stored it as 11-2234-10190 (ending in the digit 0). A pipeline that relied on an exact ID match would have failed the lookup and stopped. The Bedrock agent instead recognized the search failure and autonomously pivoted to a name-based search. This allowed the system to process the claim for patient John Doe, accurately capturing a diagnosis of Back Pain (M54.9), a birth date of 1960-10-10, and a total amount of $660 across CPT codes 97810, 73521, 98940, and 97124.
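To show roughly where these extracted values end up, the sketch below assembles them into a FHIR R4 Claim-shaped dictionary. It is an illustration, not the output of `create_fhir_claim`: the patient reference is a placeholder, the creation date is simply today's date, no per-line charges are included because only the total is known, and elements a complete Claim requires (such as provider, priority, and insurance) are omitted.

```python
from datetime import date


def build_claim(patient_ref, diagnosis_code, diagnosis_text, cpt_codes, total_usd, created):
    """Assemble a partial FHIR R4 Claim as a plain dict (not profile-validated)."""
    return {
        "resourceType": "Claim",
        "status": "active",
        "use": "claim",
        "type": {"coding": [{
            "system": "http://terminology.hl7.org/CodeSystem/claim-type",
            "code": "professional",
        }]},
        "patient": {"reference": patient_ref},
        "created": created,
        "diagnosis": [{
            "sequence": 1,
            "diagnosisCodeableConcept": {"coding": [{
                "system": "http://hl7.org/fhir/sid/icd-10-cm",
                "code": diagnosis_code,
                "display": diagnosis_text,
            }]},
        }],
        # One line item per CPT code, numbered from 1.
        "item": [
            {"sequence": i, "productOrService": {"coding": [{
                "system": "http://www.ama-assn.org/go/cpt",
                "code": code,
            }]}}
            for i, code in enumerate(cpt_codes, start=1)
        ],
        "total": {"value": total_usd, "currency": "USD"},
    }


claim = build_claim(
    patient_ref="Patient/example",  # placeholder reference
    diagnosis_code="M54.9",
    diagnosis_text="Back Pain",
    cpt_codes=["97810", "73521", "98940", "97124"],
    total_usd=660,
    created=date.today().isoformat(),
)
```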
For scenarios where data is genuinely absent, such as an insurance policy with no matching record, the system avoids silent failure. It uses Amazon SNS (Simple Notification Service) to send a natural language alert to a human operator. For example, when a test file `sample1_cms-1500-P.pdf` was uploaded to the `/input` folder without the necessary reference resources, the system generated a specific alert: "Insurance enrollment information for policy number G4683A and AnyHealth Plus Medicare plan could not be found; please verify with the insurance provider." This creates a human-in-the-loop feedback system where AI handles the volume and humans handle the exceptions.
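The notification step might look something like the sketch below. It assumes a fixed message template for simplicity, whereas the pipeline described here generates its alert text in natural language. The `publish` argument is injected so the example runs without AWS credentials; in a deployment it would typically be the `publish` method of a boto3 SNS client. The topic ARN and subject line are placeholders.

```python
def build_missing_coverage_alert(policy_number: str, plan_name: str) -> dict:
    """Build the subject and body for a missing-enrollment alert."""
    return {
        "Subject": "Claim needs review: insurance enrollment not found",
        "Message": (
            f"Insurance enrollment information for policy number {policy_number} "
            f"and {plan_name} plan could not be found; "
            "please verify with the insurance provider."
        ),
    }


def notify_operator(publish, topic_arn: str, policy_number: str, plan_name: str):
    """Send the alert through an SNS-style publish callable."""
    alert = build_missing_coverage_alert(policy_number, plan_name)
    return publish(TopicArn=topic_arn, **alert)


# Usage with a stand-in publisher that records what would be sent.
sent = []


def fake_publish(**kwargs):
    sent.append(kwargs)
    return {"MessageId": "local-test"}


response = notify_operator(
    fake_publish,
    "arn:aws:sns:us-east-1:123456789012:claims-alerts",  # placeholder ARN
    "G4683A",
    "AnyHealth Plus Medicare",
)
```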
The entire environment is deployed with the AWS CDK (Cloud Development Kit), allowing for rapid iteration and teardown. Developers can load sample data with a dedicated script that takes the data store ID, which appears as the final segment of the HealthLakeDatastoreArn. The commands to deploy, load sample data, and tear down are:
    cdk deploy
    python load_sampledata.py <data_store_id>
    cdk destroy

This transition from probabilistic agent behavior to a deterministic, supervised pipeline marks a critical shift in how generative AI is applied to high-stakes industries like healthcare, where the cost of a mistake is measured not just in dollars, but in patient care delays.




