What a post-AI bubble world would look like for the server GPU market
Sam Altman has been very good for Nvidia.
In 2022, server-class GPUs were a $10 billion business, according to Aaron Rakers, an analyst at Wells Fargo. Not bad, but still a small category. This year, revenues are expected to hit the $50 billion mark, and it’s all thanks to the generative AI craze spawned by ChatGPT.
Rakers estimates that if these trends persist, the server GPU market could be worth more than $100 billion by 2028, or, put in other terms, roughly the combined GDPs of Iceland and Lithuania. Nvidia, with a 95 percent share of that market and a growing focus on accelerated computing as a whole, will be the prime beneficiary of this spending spree.
But that’s a big “if.” Sure, OpenAI may be the hottest name in tech, recently closing a $6.6 billion funding round at a $157 billion valuation. Yes, hyperscalers like Microsoft and Google are spending big, hoping to provide the computing power that generative AI demands. And I won’t deny that generative AI features are creeping into more and more products.
But that doesn’t mean that generative AI will be a generational, industry-defining technology. What, then, happens to the server GPU market if the generative AI bubble pops?
If the market for generative AI turns out to be a transient fad, it would undoubtedly be bad news for the hyperscale cloud providers that have spent big on new data centers, servers, and GPUs. Nvidia would suffer, too.
But I’m convinced that any pain would be, if not transient, then limited. Tech companies are resourceful creatures, and I believe they’ll pivot. There are countless applications for server-class GPUs beyond generative AI, and many of these have yet to be fully realized.
Parallel Potentials
Until fairly recently, GPUs were mainly used by gamers to increase frame rates and improve picture quality. This was the GPU market. Companies like Nvidia and AMD (or, going further back, 3dfx, ATI, and PowerVR) made their money by helping gamers play the latest titles, or by selling the chipsets that powered the latest consoles of the era.
GPUs became essential to gaming because they were better at parallel processing than CPUs. They could perform more simultaneous calculations than even the most expensive Intel silicon—which is incredibly useful when you’re simulating real-world physics, or trying to render thousands of pixels every millisecond.
To illustrate that point: the most capable Intel Xeon CPU has 128 cores. An Nvidia H100 GPU—the kind used to power generative AI applications—has nearly 17,000.
Think of cores like workers. A CPU might have 64 really fast workers, but for tasks that lend themselves well to parallelization (those where you can divide the work across multiple “workers”) the GPU wins. Even if its workers are slightly slower, the fact that there are so many more of them means it gets the job done quicker.
Over the past 15 years, AMD and Nvidia (the two major GPU manufacturers, although Nvidia dominates the server market) have built the foundations that allow developers to use these cores, not for rendering the latest Call of Duty title, but for running general-purpose applications. Nvidia had one particular leg up: in 2007 it created CUDA, an API that allowed software developers to create applications accelerated by GPUs.
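The worker analogy can be made concrete with a little arithmetic. The sketch below is a toy model, not a benchmark: the task count and the per-worker speeds are made-up numbers chosen purely for illustration, and it assumes the tasks are fully independent so they can be split evenly across workers.

```python
import math

def time_to_finish(num_tasks, num_workers, tasks_per_second_per_worker):
    """Seconds needed when independent tasks are split evenly across workers."""
    # The busiest worker handles ceil(num_tasks / num_workers) tasks,
    # and everyone has to wait for that worker to finish.
    tasks_per_worker = math.ceil(num_tasks / num_workers)
    return tasks_per_worker / tasks_per_second_per_worker

NUM_TASKS = 1_000_000  # hypothetical workload of independent tasks

# Hypothetical CPU: 64 workers, each completing 1,000 tasks per second.
cpu_seconds = time_to_finish(NUM_TASKS, 64, 1_000)

# Hypothetical GPU: 16,000 workers, each four times slower than a CPU worker.
gpu_seconds = time_to_finish(NUM_TASKS, 16_000, 250)

print(f"CPU: {cpu_seconds:.3f} s, GPU: {gpu_seconds:.3f} s")
# CPU: 15.625 s, GPU: 0.252 s
```

Under the same assumptions, a job that cannot be divided at all (`num_tasks=1`) takes 0.001 seconds on the toy CPU and 0.004 seconds on the toy GPU, which is why the advantage only shows up for work that parallelizes well.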
And that matters because CPUs are different beasts to GPUs. You can’t just write normal code and expect it to run on a GPU with full acceleration. It needs to be delicately tuned for the underlying hardware. CUDA dramatically simplified this process, making it accessible to everyone.
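To give a feel for what that tuning involves, here is a sketch in plain Python rather than CUDA itself. It contrasts an ordinary loop with the same calculation rewritten as a per-element function, which is roughly the shape GPU code takes: one small function, run once per data element by many threads. The function names are hypothetical, and the `launch` helper simply loops in sequence as a stand-in for hardware that would run those calls in parallel.

```python
def scale_and_add_loop(a, xs, ys):
    """Ordinary CPU-style code: one loop walks every element in order."""
    out = []
    for x, y in zip(xs, ys):
        out.append(a * x + y)
    return out

def scale_and_add_kernel(i, a, xs, ys, out):
    """Kernel-style code: handles only element i and writes only out[i]."""
    out[i] = a * xs[i] + ys[i]

def launch(kernel, n, *args):
    """Hypothetical stand-in for a GPU launch: calls the kernel once per index.

    A real GPU would run these calls at the same time on different cores;
    here they run one after another so the example stays plain Python.
    """
    for i in range(n):
        kernel(i, *args)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [10.0, 20.0, 30.0, 40.0]

result = [0.0] * len(xs)
launch(scale_and_add_kernel, len(xs), 2.0, xs, ys, result)

print(result)  # [12.0, 24.0, 36.0, 48.0]
print(result == scale_and_add_loop(2.0, xs, ys))  # True
```

Because each call writes only to its own slot in `out`, the calls could run in any order without changing the answer, and that independence is what makes the work safe to spread across thousands of cores.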
As a result, Nvidia’s transformation is especially stark. It’s no longer just a gaming company, or even just a hardware company (its actual manufacturing is, in any case, primarily contracted out to other vendors). Nvidia is an accelerated systems company, having constructed a mature software ecosystem that allows companies to run the most demanding computational tasks on its GPUs.
Running and training AI is computationally intensive and involves processing vast amounts of data, so it makes sense that server-class GPUs are a hot commodity. But, if you think a little further, it’s not hard to identify other potential use-cases.
And, when you consider the scale of these use-cases, it’s enough to give you hope for the future of the segment.
Looking Forward by Looking Back
Many of these use-cases are well established. The most obvious example is, of course, data analytics.
It’s hard to fathom petabyte-scale datasets. But if you’re a large government entity, running services for tens (or hundreds) of millions of people, or a large company serving a global market, it’s your reality. These organizations face a distinct challenge: more data requires more computational power to process it.
Consider a company like Amazon, for example. It has hundreds of millions of customers around the world. With every interaction, each customer generates