NVIDIA | June 12, 2026 | 2 sources

NVIDIA's Blackwell Ultra NVL72 leads first agentic AI benchmark, AgentPerf

AI Analysis

Artificial Analysis introduced AgentPerf, billed as the industry's first benchmark for agentic AI infrastructure, giving developers and enterprises a standardized way to compare systems running agent workloads rather than single-shot inference. In the inaugural round, NVIDIA's Blackwell Ultra NVL72 platform posted leading results, including running 20x more agents per megawatt and topping agentic coding performance.

The benchmark matters because agents have changed inference economics: instead of a single prompt and response, an agent issues many chained, tool-using calls, multiplying compute and energy demands. A per-megawatt efficiency metric speaks directly to the cost pressure developers are voicing this week, with token economics and 'tokenmaxxing' dominating the conversation. NVIDIA also published guidance on deploying long-context reasoning and MiniMax M3 agentic workflows on its infrastructure.
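To make the per-megawatt arithmetic concrete, here is a minimal sketch of how chained calls could translate into energy per task and into agents supported per megawatt. It is not AgentPerf's methodology, which the sources summarized here do not describe. Every name and number below (calls per task, tokens per call, joules per token, task interval) is a hypothetical placeholder chosen only to show the shape of the calculation.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload shape; not taken from AgentPerf."""
    calls_per_task: int   # chained model calls needed to finish one task
    tokens_per_call: int  # tokens processed per call


def energy_per_task_joules(w: Workload, joules_per_token: float) -> float:
    # Energy scales with the total tokens processed across all chained calls.
    return w.calls_per_task * w.tokens_per_call * joules_per_token


def agents_per_megawatt(w: Workload, joules_per_token: float,
                        seconds_per_task: float) -> float:
    # Average power drawn by one agent that completes one task
    # every `seconds_per_task` seconds.
    watts_per_agent = energy_per_task_joules(w, joules_per_token) / seconds_per_task
    return 1_000_000 / watts_per_agent


# Illustrative, made-up inputs.
single_shot = Workload(calls_per_task=1, tokens_per_call=2000)
agentic = Workload(calls_per_task=25, tokens_per_call=2000)

single_energy = energy_per_task_joules(single_shot, joules_per_token=0.5)
agentic_energy = energy_per_task_joules(agentic, joules_per_token=0.5)
energy_multiplier = agentic_energy / single_energy

baseline_agents = agents_per_megawatt(agentic, joules_per_token=0.5,
                                      seconds_per_task=100.0)
efficient_agents = agents_per_megawatt(agentic, joules_per_token=0.25,
                                       seconds_per_task=100.0)
```

Under these assumptions, the agentic task costs 25 times the energy of the single-shot one, and halving the energy per token doubles the number of agents a megawatt can sustain. Real systems will not scale this linearly, since caching, batching, and idle power all change the result.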

Because NVIDIA is the vendor whose hardware is being measured, its leading position on a benchmark it heavily promotes invites skepticism. Independent verification and broad participation from competing accelerators (AMD, Google TPU, custom silicon) will determine AgentPerf's credibility. Still, a shared standard for agentic efficiency is a meaningful step as enterprises move from pilots to production agent fleets and need apples-to-apples comparisons. Watch which rival platforms submit results and whether AgentPerf becomes an accepted industry yardstick or stays an NVIDIA showcase.

Sources

blogs.nvidia.com

https://blogs.nvidia.com/blog/nvidia-blackwell-agentperf-artificial-analysis/

developer.nvidia.com

https://developer.nvidia.com/blog/nvidia-achieves-leading-agentic-coding-performance-on-first-agentic-ai-benchmark/