The Claude AI Pipeline That Found $500,000 in Google API Flaws

The modern bug bounty hunter no longer spends weeks manually probing a single endpoint with a proxy and a prayer. Instead, the industry is shifting toward a model of automated harvesting, where the primary skill is not just finding a bug, but building a pipeline that can find thousands of them. This week, the security community is dissecting a masterclass in this approach: a researcher who leveraged Claude AI to systematically dismantle Google's API security surface, turning a three-month experiment into a $500,000 payday.

The Architecture of Automated Discovery

The scale of the operation was designed to overwhelm traditional manual auditing. The researcher began by aggregating a massive dataset of potential targets, collecting 61,200 Android APK files, decrypting iOS IPA files, and utilizing Chrome debugger APIs to extract API keys. By combining these keys with Google's own discovery documents—the machine-readable specifications of their APIs—the researcher identified over 1,500 distinct APIs to test. A critical breakthrough occurred when the researcher discovered that adding a specific parameter, `?labels=GOOGLE_INTERNAL`, to the requests exposed internal endpoint specifications that were intended to be hidden from the public. This effectively expanded the attack surface to include the very internal tools Google developers use.
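To make the role of discovery documents concrete, here is a minimal sketch of how one can be flattened into a list of testable methods. It relies on the standard discovery document fields (`rootUrl`, `servicePath`, `resources`, `methods`, `id`, `httpMethod`, `path`). The sample document, its hostname, and the `iter_methods` helper are illustrative assumptions, not the researcher's tooling.

```python
from typing import Iterator, Optional


def iter_methods(doc: dict, base_url: Optional[str] = None) -> Iterator[dict]:
    """Yield every method in a discovery document, walking nested resources."""
    # Only the top-level document carries rootUrl/servicePath; nested
    # resources inherit the base URL computed at the top.
    base = base_url or doc.get("rootUrl", "") + doc.get("servicePath", "")
    for method in doc.get("methods", {}).values():
        yield {
            "id": method.get("id"),
            "http_method": method.get("httpMethod"),
            "url": base + method.get("path", ""),
        }
    for resource in doc.get("resources", {}).values():
        yield from iter_methods(resource, base)


# Hypothetical, hand-written document in the discovery format.
SAMPLE_DOC = {
    "rootUrl": "https://example.googleapis.com/",
    "servicePath": "",
    "resources": {
        "projects": {
            "methods": {
                "get": {
                    "id": "example.projects.get",
                    "httpMethod": "GET",
                    "path": "v1/projects/{projectId}",
                }
            },
            "resources": {
                "logs": {
                    "methods": {
                        "list": {
                            "id": "example.projects.logs.list",
                            "httpMethod": "GET",
                            "path": "v1/projects/{projectId}/logs",
                        }
                    }
                }
            },
        }
    },
}

methods = list(iter_methods(SAMPLE_DOC))
for m in methods:
    print(m["http_method"], m["url"], m["id"])
```

Because the format is uniform across services, the same few lines apply to any API that publishes such a document, which is what makes enumeration at this scale cheap.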

To navigate the authentication layer, the researcher had to solve Google's First Party Authentication (FPA). By analyzing leaked sourcemaps, they located the `gapix` library, which contained the logic for generating FPA v2 headers. The authentication token was a SHA-1 hash computed over a string laid out as `email:gaiaId timestamp sessionCookie origin`. By manipulating the Gaia ID (Google Accounts and ID Administration) and observing how obfuscation affected access rights, the researcher created a reliable way to bypass authentication checks across various services.
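As a rough illustration of that hashing structure, the sketch below joins the fields in the order the article gives and hashes the result with SHA-1. The exact separators, the field encoding, and the way the digest is placed into a request header are not stated in the source, so all three are assumptions here; `fpa_digest` and the sample values are hypothetical.

```python
import hashlib


def fpa_digest(email: str, gaia_id: str, timestamp: int,
               session_cookie: str, origin: str) -> str:
    """Hash the fields in the layout described in the article.

    Assumption: the layout "email:gaiaId timestamp sessionCookie origin"
    is taken literally, with a colon and single spaces as separators.
    """
    material = f"{email}:{gaia_id} {timestamp} {session_cookie} {origin}"
    return hashlib.sha1(material.encode("utf-8")).hexdigest()


# Placeholder values only; none of these correspond to a real account.
digest = fpa_digest(
    email="user@example.com",
    gaia_id="100000000000000000000",
    timestamp=1700000000,
    session_cookie="PLACEHOLDER_COOKIE",
    origin="https://example.google.com",
)
print(digest)
```

The point of the sketch is that every input is either known to the client or easy to observe, so once the layout is recovered from a sourcemap the header can be produced outside the browser.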

Once the access path was clear, the researcher integrated Claude AI via the Model Context Protocol (MCP). To prevent the AI from prematurely concluding that an endpoint was secure—a common failure mode in LLM-based testing—the researcher implemented what they called a Ralph Wiggum loop. This logic forced the AI to test every single identified endpoint at least once before the session could terminate. To optimize the AI's performance, the researcher simplified the `probe_api` input to focus strictly on the endpoint, path, and request body, while offloading the complex FPA authentication handling to the backend. This allowed Claude to focus entirely on crafting malicious payloads and identifying logic flaws.
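The coverage rule behind that loop can be sketched in a few lines. This is an assumed reconstruction of the idea rather than the researcher's code: `CoverageGate`, `run_until_covered`, and the stub agent are hypothetical names, and the real pipeline would call the model where the stub appears.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class CoverageGate:
    """Tracks which endpoints have been probed at least once."""
    endpoints: Set[str]
    probed: Set[str] = field(default_factory=set)

    def record(self, endpoint: str) -> None:
        self.probed.add(endpoint)

    def remaining(self) -> List[str]:
        return sorted(self.endpoints - self.probed)

    def may_finish(self) -> bool:
        return not self.remaining()


def run_until_covered(gate: CoverageGate,
                      agent_step: Callable[[List[str]], List[str]],
                      max_rounds: int = 100) -> int:
    """Keep handing the untested endpoints back to the agent until none remain."""
    rounds = 0
    while not gate.may_finish() and rounds < max_rounds:
        # The agent sees what is still untested and reports what it probed.
        for endpoint in agent_step(gate.remaining()):
            if endpoint in gate.endpoints:
                gate.record(endpoint)
        rounds += 1
    return rounds


# Stub agent that tries to stop early by probing one endpoint per round.
def lazy_agent(remaining: List[str]) -> List[str]:
    return remaining[:1]


gate = CoverageGate(endpoints={"voice.v1", "widevine.v1", "appengine.v1"})
rounds = run_until_covered(gate, lazy_agent)
print(rounds, gate.may_finish())
```

The design choice is that termination is decided by the harness and not by the model, so an early "looks secure" verdict cannot end the session.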

The financial results of this pipeline were staggering. In less than 90 days, the researcher secured several high-value bounties. Google Voice allowed personally identifiable information (PII) and recovery phone numbers to be dumped via unauthenticated requests, earning $20,000. Vertex AI Search for Commerce allowed customer system prompts (preambles) to be read and modified without access control, resulting in a $30,000 reward. The Widevine DRM partner portal API exposed organization lists and PGP/AES keys, yielding $16,004.40. More serious still was a flaw in PLX, an internal analytics platform, where a `setIamPolicy` call on a staging API allowed up to 2.1PB of confidential YouTube data to be dumped; it paid a total of $24,000 across multiple reports. Finally, an IAM validation failure in App Engine allowed 24-hour logs to be retrieved from arbitrary projects; it was assigned CVE-2026-8934 and paid out $18,000.

The Illusion of Sophistication

While the use of Claude and MCP makes this look like a futuristic AI attack, the actual vulnerabilities discovered reveal a more mundane and troubling reality. There was no complex heap overflow or sophisticated zero-day exploit involved. Instead, the researcher found a recurring pattern of basic administrative failures: missing IAM checks, unauthenticated GraphQL endpoints, and staging environments that were accidentally connected to production databases.

This creates a sharp contrast between the perceived complexity of AI-driven hacking and the simplicity of the errors being exploited. The AI did not invent a new way to break encryption; it simply performed the tedious task of checking the endpoints of more than 1,500 APIs for the same basic mistakes, the kind of check a human reviewer would likely stop making after the first fifty. The real danger here is not the AI's intelligence, but its persistence. When a company uses a standardized infrastructure, a single pattern of neglect, such as forgetting to implement an authorization check on a specific API group, tends to be replicated across dozens of different services. AI is the perfect tool for this kind of pattern-based exhaustion.

For security teams, this highlights three critical operational risks. First is the failure of environment isolation. When a staging API points to a production database, any weakness in the staging environment's looser security becomes a direct pipeline to production data. Second is the danger of machine-readable specifications. Discovery docs, GraphQL SDLs, and proto files are designed to help developers, but in the hands of an AI, they serve as a perfect map for precision fuzzing. Third is the fundamental shift in the role of AI in cybersecurity. AI is no longer just a tool for writing reports or summarizing logs; it is now a high-efficiency verification engine capable of auditing an entire corporate attack surface in a fraction of the time it takes a human team.
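The first of these risks is the easiest to check mechanically. The sketch below assumes a team keeps an inventory of services with an `environment` label and a `database_url`; that inventory format, the hostnames, and the `find_isolation_violations` helper are all hypothetical.

```python
from typing import Dict, List, Set
from urllib.parse import urlparse


def find_isolation_violations(services: List[Dict[str, str]],
                              production_hosts: Set[str]) -> List[str]:
    """Return names of non-production services wired to a production data host."""
    violations = []
    for svc in services:
        if svc["environment"] == "production":
            continue
        host = urlparse(svc["database_url"]).hostname
        if host in production_hosts:
            violations.append(svc["name"])
    return violations


# Hypothetical inventory: one staging service points at the production database.
SERVICES = [
    {"name": "analytics-prod", "environment": "production",
     "database_url": "postgresql://db.prod.internal:5432/analytics"},
    {"name": "analytics-staging", "environment": "staging",
     "database_url": "postgresql://db.prod.internal:5432/analytics"},
    {"name": "billing-staging", "environment": "staging",
     "database_url": "postgresql://db.staging.internal:5432/billing"},
]

violations = find_isolation_violations(SERVICES, {"db.prod.internal"})
print(violations)
```

A check like this only covers declared configuration; it would not catch a staging service that reaches production data through an intermediate API.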

On the pipeline side, the researcher dealt with false positives by implementing an operation ID reproduction system. Every bug report generated by the AI included a unique ID that linked to the exact request used. This allowed the researcher to verify the bug with a single click on a frontend dashboard, ensuring that the evidence was immutable and the reproduction was instantaneous.
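One way such an ID scheme could work is sketched below. The source does not say how the IDs were generated, so deriving the ID from a hash of the canonical request is an assumption made here for illustration; `OperationLog` and its methods are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict


class OperationLog:
    """Stores each request under an ID derived from its exact content."""

    def __init__(self) -> None:
        self._records: Dict[str, str] = {}

    def record(self, method: str, url: str, body: Any) -> str:
        # Canonical JSON: sorted keys and fixed separators, so the same
        # request always serializes to the same bytes.
        canonical = json.dumps(
            {"method": method, "url": url, "body": body},
            sort_keys=True,
            separators=(",", ":"),
        )
        op_id = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
        # First write wins; an existing record is never overwritten.
        self._records.setdefault(op_id, canonical)
        return op_id

    def replay_request(self, op_id: str) -> Dict[str, Any]:
        """Return the stored request exactly as it was recorded."""
        return json.loads(self._records[op_id])


log = OperationLog()
op_id = log.record("POST", "https://example.googleapis.com/v1/items", {"q": "test"})
print(op_id, log.replay_request(op_id))
```

Tying the ID to the request content means a report cannot point at a request that differs from the one actually sent, which is the property that makes one-click reproduction trustworthy.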

This shift toward AI-automated auditing means that the window for fixing basic configuration errors has closed. In an era where an LLM can map and probe thousands of endpoints per hour, the only viable defense is a zero-trust architecture where authentication is not a perimeter check, but a mandatory requirement for every single function call.
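In code, "authorization on every function call" often takes the form of a check attached to each handler instead of a single gate at the edge. The following is a minimal sketch under that assumption; `require_permission`, `PermissionDenied`, the caller dictionary, and `read_logs` are hypothetical names, not part of any Google API.

```python
import functools
from typing import Any, Callable, Dict, Optional


class PermissionDenied(Exception):
    """Raised when a caller lacks the permission a handler requires."""


def require_permission(permission: str) -> Callable:
    """Decorator that checks the caller's permissions before running the handler."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(caller: Optional[Dict[str, Any]], *args: Any, **kwargs: Any) -> Any:
            # Deny by default: no caller, or no matching permission, means no access.
            if caller is None or permission not in caller.get("permissions", ()):
                raise PermissionDenied(f"{func.__name__} requires {permission}")
            return func(caller, *args, **kwargs)
        return wrapper
    return decorator


@require_permission("logs.read")
def read_logs(caller: Dict[str, Any], project_id: str) -> str:
    return f"logs for {project_id}"


print(read_logs({"permissions": ["logs.read"]}, "demo-project"))
```

With this pattern a handler that is missing its check stands out in review, because the decorator is absent from its definition.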
