AWS raises EC2 GPU prices ~20% on July 1; Graviton5 CPUs reach general availability
The ~20% GPU price increase for EC2 Capacity Blocks for ML is the second hike in six months (following ~15% in January), and it squeezes ML compute budgets just as agentic workloads drive up token consumption. P6-B300 instances now list at $14.04/hr and P5 at $5.19/hr, with AWS attributing the increase to memory shortages and supply/demand imbalance.
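As a rough sketch of the arithmetic behind those figures: two successive hikes compound, so ~15% followed by ~20% comes to roughly 38% cumulatively, not 35%. The snippet below also backs out implied pre-July rates from the quoted list prices. It assumes that both hikes applied to the same instance types, that the quoted prices are post-hike, and that the rounded percentages are exact; none of this is confirmed by AWS, and the function names are illustrative only.

```python
def compound_increase(*hikes):
    """Cumulative fractional increase from successive fractional hikes."""
    factor = 1.0
    for hike in hikes:
        factor *= 1.0 + hike
    return factor - 1.0


def implied_prior_price(current_price, hike):
    """Price before a single fractional hike, given the post-hike price."""
    return current_price / (1.0 + hike)


# Assumption: the January (~15%) and July (~20%) hikes hit the same price list.
cumulative = compound_increase(0.15, 0.20)

# Assumption: $14.04/hr and $5.19/hr are the post-hike list prices quoted above.
p6_b300_prior = implied_prior_price(14.04, 0.20)
p5_prior = implied_prior_price(5.19, 0.20)

print(f"Cumulative increase over six months: {cumulative:.1%}")
print(f"Implied pre-July P6-B300 rate: ${p6_b300_prior:.2f}/hr")
print(f"Implied pre-July P5 rate: ${p5_prior:.3f}/hr")
```

Under those assumptions the implied pre-July rates are about $11.70/hr for P6-B300 and about $4.33/hr for P5. Because the published percentages are approximate, treat these as estimates, not historical list prices.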
Graviton5, reaching general availability the same day, is the counterweight: AWS claims 25% better performance and a 5x larger cache than Graviton4, improving CPU price-performance for workloads that can shift off GPUs. AWS also shipped agent tooling (Continuum, Context, and Amazon Quick agent enhancements), extending its agent-ops surface.
Community reaction to the GPU hike was frustration: 'scale meant savings, that's not true anymore,' with developers noting percentage-based enterprise discounts now stack against thinner margins. The increase fits the week's compute-economics theme: Meta leasing idle capacity, NVIDIA's revenue-share deals, and Etched's cheaper-inference pitch all respond to the same GPU-scarcity pressure. The strategic read is that AWS is nudging customers toward its own silicon (Graviton, Trainium) as third-party GPU costs climb. Watch whether the GPU price increases accelerate migration to alternative inference hardware.