I know what you’re probably thinking. Nvidia (NASDAQ: NVDA) is the most valuable company in the world — even bigger than Apple (NASDAQ: AAPL). Why would it need to “become the next Apple”?
Nvidia just held its annual GTC conference. There, management talked extensively about the boom in artificial intelligence (AI) inferencing and Nvidia’s growing ecosystem, which extends beyond graphics processing units (GPUs) to capture more than one-time hardware sales.
Just as Apple built an ecosystem of consumer products that have become as essential to many households as laundry detergent and toothpaste, so too is Nvidia building an ecosystem primarily for enterprises to create a recurring revenue stream in the age of AI inferencing.
Here’s why this evolving business model could be a game changer for investors by adding balance to Nvidia’s investment thesis.
Nvidia’s earnings growth has exploded in recent years as key hyperscaler customers build data centers that rely on Nvidia GPUs. The data center business is so massive that other segments like professional visualization, gaming, automotive, and robotics barely move the needle. In fiscal 2026, the data center segment made up just under 90% of total revenue. And that puts pressure on Nvidia to continue selling GPUs to hyperscalers to maintain its breakneck growth rate.
Nvidia’s latest architecture, Rubin, already addresses part of the problem. It includes six chips that work together to improve efficiency at rack scale for data center applications. Many of Rubin’s breakthroughs are related to AI inference rather than training.
Think of the AI model as the knowledge base that AI agents and tools draw on to do real-world work. Applying those models to real-world tasks requires immense compute for inference.
Token-based pricing for inference creates a recurring revenue stream for Nvidia. The idea is that hyperscalers will charge customers based on the number of AI inference tokens they consume. As usage of generative AI, AI agents, and physical AI grows, so will the number of tokens demanded. Nvidia’s hardware and software are built to process tokens faster, which should appeal to hyperscalers.
All told, Nvidia’s goal is to create an ecosystem that includes its purpose-built AI chips, networking hardware, and inferencing software that will scale in lockstep with token demand — as inference and physical AI will demand far more tokens than simple chat-based generative AI.
