This week, the developer community is watching a deal that flips the AI hardware script. Meta has agreed to deploy millions of AWS's custom Graviton4 ARM CPUs across its AI infrastructure. Not GPUs. CPUs. And not a small pilot — a multi-million-unit commitment that signals a fundamental rethinking of how inference workloads should run.

AWS Unveiled Graviton4 in December, Meta Signed a Multi-Million-Unit Deal

AWS released its fourth-generation Graviton chip in December, purpose-built for AI inference. Meta signed a contract to acquire millions of these chips, and AWS made the deal public last Friday. Graviton4 is an ARM-based CPU, not a GPU. While GPUs dominate large-scale model training, the AI agents that run on top of trained models — performing real-time inference, code generation, search, and multi-step task orchestration — generate massive CPU-friendly workloads. AWS explicitly stated that Graviton4 was designed with these AI agent workloads in mind.
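To make the "CPU-friendly workload" point concrete, here is a minimal sketch of the kind of small-batch, latency-sensitive inference an agent-style service performs on a CPU. The toy model, sizes, and dynamic int8 quantization are illustrative assumptions, not details of Meta's or AWS's actual stacks:

```python
# A minimal sketch of CPU-bound inference: many small, latency-sensitive
# forward passes rather than huge training batches. Model and sizes are
# illustrative only.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(512, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 128),
).eval()

# Dynamic int8 quantization is one common CPU-inference optimization:
# it stores the Linear weights as int8 and speeds up CPU matmuls.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

with torch.inference_mode():
    batch = torch.randn(8, 512)   # small batch, typical of per-request agent calls
    logits = quantized(batch)
    print(logits.shape)           # torch.Size([8, 128])
```

Workloads shaped like this, with small batches and tight latency budgets, are exactly where the per-request overhead of shipping data to a GPU stops paying for itself.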

The Old GPU Monopoly Is Cracking Under CPU Inference Pressure

Not long ago, GPUs were the only game in town for AI workloads: Nvidia's H100 and B200 GPUs owned both training and inference. That monopoly is eroding. As AI agents proliferate, the inference stage increasingly favors CPUs for efficiency. Nvidia has sensed the shift and answered with its own ARM-based Vera CPU, though the business models differ: AWS offers its chips exclusively within its own cloud, while Nvidia sells chips directly to enterprises and cloud providers, AWS included. Amazon CEO Andy Jassy, in his shareholder letter last week, took direct aim at Nvidia and Intel, stating that "enterprises want better price-performance from AI."

The Real Change Developers Will Feel Is a Diversified Chip Landscape

This deal could redirect Meta's cloud spending back to AWS. Meta has historically been an AWS customer, but last August it locked in a six-year, $10 billion agreement with Google Cloud. AWS timed this announcement to land immediately after Google Cloud Next, tightening the competitive pressure. Meanwhile, AWS builds its own AI accelerator, Trainium (a custom training chip, not a GPU), but last month Anthropic signed a 10-year, $100 billion deal to reserve Trainium capacity, leaving AWS to pivot toward CPUs for broader availability. For developers, the takeaway is clear: when architecting AI inference infrastructure, ARM CPUs are no longer an afterthought — they are a primary option alongside GPUs.
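As one illustration of treating ARM as a first-class inference target at provisioning time, here is a minimal sketch using boto3. The instance families (c8g for Graviton4, g6 for GPUs), the region, and the AMI ID are placeholder assumptions for illustration, not recommendations from AWS or Meta:

```python
# Minimal sketch: route workloads to ARM (Graviton) or GPU instance families
# at provisioning time. Instance families, region, and AMI ID are placeholders.
import boto3

def pick_instance_type(workload: str) -> str:
    """Keep training on GPU instances; send agent-style inference to ARM CPUs."""
    if workload == "training":
        return "g6.xlarge"   # example GPU-backed instance family
    return "c8g.xlarge"      # example Graviton4 (ARM) instance family

ec2 = boto3.client("ec2", region_name="us-east-1")
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder; an ARM64 AMI is required for c8g
    InstanceType=pick_instance_type("inference"),
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```

The point of the sketch is the routing decision itself: inference capacity becomes a choice between architectures, not a default to GPUs.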

Whether Meta's bet becomes the starting gun for CPU-based AI inference or remains a tactical deal will define the next phase of infrastructure competition.