Jensen Huang walked onto the SAP Center stage on March 16 in his leather jacket, and two hours later walked off having repositioned Nvidia from chipmaker to full-stack agentic AI platform company. The keynote was dense with product announcements, but the throughline was unmistakable: Nvidia thinks the next phase of AI is agents, and it plans to own every layer they run on.
That framing has uncomfortable implications for the rest of the industry. If Jensen is right that AI models are rapidly commoditizing while infrastructure becomes the real moat, then Nvidia just drew a circle around the only part of the stack that prints money long-term.
The trillion-dollar demand signal
The headline number: Nvidia now projects at least $1 trillion in revenue opportunity from Blackwell and Vera Rubin platforms through 2027. That is double the $500 billion estimate from last year's GTC. Jensen attributed this to a straightforward calculation: AI compute demand has grown roughly one million times over the past two years as reasoning models replaced retrieval-based systems and usage scaled simultaneously.
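How do you get to a million? A back-of-envelope sketch in Python: the split between the two drivers below is my illustration, not Nvidia's published math, but two simultaneous thousand-fold shifts compound to exactly that figure.

```python
# Back-of-envelope decomposition of the "million times" demand claim.
# The specific split between the two drivers is an illustrative
# assumption, not Nvidia's published breakdown.
reasoning_factor = 1_000  # chain-of-thought output vs. one-shot retrieval answers
usage_factor = 1_000      # more users, more queries, more agents per user

print(f"Compound demand growth: {reasoning_factor * usage_factor:,}x")  # 1,000,000x
```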
"If they could just get more capacity, they could generate more tokens, their revenues would go up," Huang told the crowd, referring to Nvidia's cloud and enterprise customers. The company reported 11 straight quarters of revenue growth above 55%, and the current quarter is tracking at about $78 billion, up 77% year-over-year.
Numbers this large invite skepticism, but the demand signal appears real. AWS, Microsoft Azure, Google Cloud, Oracle, and CoreWeave are all expanding their Nvidia deployments. Meta, ByteDance, and Alibaba are building at similar scale. The constraint right now is not demand but power delivery.
Vera Rubin: the inference machine
The hardware centerpiece was the Vera Rubin platform, which is now in production. It is not a single chip but a full-stack system with seven chip types assembled into five rack-scale designs: Vera CPUs, Rubin GPUs, NVLink 6 switches, ConnectX-9 NICs, BlueField-4 DPUs, Spectrum-X optical NICs, and Groq 3 LPUs. Combined specs: 3.6 exaflops and 260 terabytes per second of NVLink bandwidth.
The Groq integration matters here. Nvidia acquired Groq for $20 billion in late 2025, and the Groq 3 LPU is purpose-built for inference decode. Nvidia's new Dynamo software layer disaggregates inference: prefill, the compute-bound processing of the prompt, goes to the Rubin GPU; decode, the memory-bandwidth-bound generation of output tokens, goes to the Groq LPU. The result, according to both Nvidia and third-party analysis from SemiAnalysis, is 35x more throughput per megawatt compared to Blackwell alone.
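To make the split concrete, here is a minimal Python sketch of disaggregated serving as Nvidia describes it. The class and method names are hypothetical stand-ins, not Dynamo's actual API, and the model step is faked; the architectural point is the handoff of cached attention state from a prefill pool to a decode pool.

```python
# Minimal sketch of disaggregated inference: compute-bound prefill on one
# device pool, bandwidth-bound decode on another. Names are hypothetical
# stand-ins, not Dynamo's actual API.
from dataclasses import dataclass

@dataclass
class KVCache:
    """Attention key/value state handed from prefill to decode."""
    tokens: list[int]

class PrefillWorker:  # would map to a Rubin GPU in Nvidia's design
    def prefill(self, prompt_tokens: list[int]) -> KVCache:
        # Process the whole prompt in one large, parallel batch.
        return KVCache(tokens=list(prompt_tokens))

class DecodeWorker:  # would map to a Groq 3 LPU
    def decode(self, cache: KVCache, max_new: int) -> list[int]:
        out = []
        for _ in range(max_new):
            # One token at a time: throughput here is dominated by memory
            # bandwidth, which is what an inference-specialized part targets.
            next_tok = hash(tuple(cache.tokens)) % 50_000  # stand-in for a model step
            cache.tokens.append(next_tok)
            out.append(next_tok)
        return out

def serve(prompt_tokens: list[int]) -> list[int]:
    cache = PrefillWorker().prefill(prompt_tokens)  # phase 1: prefill
    return DecodeWorker().decode(cache, max_new=8)  # phase 2: decode

print(serve([101, 2009, 2003]))
```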
That figure deserves scrutiny. SemiAnalysis independently verified and actually exceeded Nvidia's own claims, finding roughly 50x more tokens per watt versus the Hopper H200 (an older baseline than the Blackwell comparison Nvidia used, which explains the larger multiple). Jensen joked that analyst Dylan Patel "accused me of sandbagging. He was right." Whether these benchmarks hold across diverse production workloads remains to be seen, but the direction is clear: Nvidia is engineering specifically for inference throughput, not just training FLOPS.
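A quick calculation shows why the per-megawatt framing matters when power delivery, not chip supply, is the constraint. The baseline figure below is a hypothetical placeholder; only the 35x multiplier comes from the keynote claim.

```python
# What "35x per megawatt" buys when the site, not the chip supply, is the
# limit. The baseline throughput is a hypothetical placeholder.
baseline_tokens_per_sec_per_mw = 1_000_000  # hypothetical Blackwell-era figure
multiplier = 35                             # claimed Vera Rubin gain
site_power_mw = 100                         # a fixed, power-limited deployment

before = baseline_tokens_per_sec_per_mw * site_power_mw
after = before * multiplier
print(f"{before:,} -> {after:,} tokens/sec at the same {site_power_mw} MW")
```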