Nvidia, which just earned over $10 billion in one quarter on its datacenter-oriented compute GPUs, plans to at least triple output of such products in 2024, according to the Financial Times, which cites sources with knowledge of the matter. The plan is very ambitious, but if Nvidia pulls it off and demand for its A100, H100, and other compute GPUs for artificial intelligence (AI) and high-performance computing (HPC) applications remains strong, it could mean incredible revenue for the company.
Demand for Nvidia's flagship H100 compute GPU is so high that the part is sold out well into 2024, the FT reports. The company intends to increase production of its GH100 processors at least threefold, the business publication claims, citing three individuals familiar with Nvidia's plans. Projected H100 shipments for 2024 range between 1.5 million and 2 million units, a significant rise from the roughly 500,000 units anticipated this year.
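Some quick arithmetic shows why those shipment numbers matter. Nvidia does not publish H100 list pricing, so the per-unit range in the sketch below is purely an assumption drawn from widely reported street prices; treat the result as a rough illustration, not a forecast.

```python
# Back-of-envelope H100 revenue sketch. The ASP range is an
# assumption based on press reports, not an official Nvidia figure.
LOW_UNITS, HIGH_UNITS = 1_500_000, 2_000_000  # FT-reported 2024 shipment range
LOW_ASP, HIGH_ASP = 25_000, 40_000            # assumed $ per unit, illustrative only

low_revenue = LOW_UNITS * LOW_ASP      # conservative corner
high_revenue = HIGH_UNITS * HIGH_ASP   # optimistic corner

print(f"H100 revenue range: ${low_revenue / 1e9:.1f}B to ${high_revenue / 1e9:.1f}B")
# -> H100 revenue range: $37.5B to $80.0B
```

Even the conservative corner of that range would dwarf the datacenter revenue Nvidia reported for its most recent quarter.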
Because Nvidia's CUDA framework has been tailored for AI and HPC workloads, hundreds of applications run only on Nvidia's compute GPUs. While both Amazon Web Services and Google have their own custom processors for AI training and inference, they still have to buy boatloads of Nvidia compute GPUs because their clients want to run their applications on that hardware.
But increasing the supply of the H100 compute GPU, the GH200 Grace Hopper supercomputing platform, and products based on them is not going to be easy. Nvidia's GH100 is a complex processor that is rather hard to make. To triple its output, Nvidia has to clear several bottlenecks.
Firstly, the GH100 compute GPU is a huge piece of silicon at 814 mm², which makes it hard to produce in large volumes. Although yields of the product are likely reasonably high by now, Nvidia still needs to secure a lot of 4N wafer supply from TSMC to triple the output of its GH100-based products. A rough estimate suggests TSMC and Nvidia can get at most 65 chips per 300 mm wafer.
To manufacture 2 million such chips would thus require nearly 31,000 wafers: certainly possible, but a sizeable fraction of TSMC's total 5nm-class wafer output, which is around 150,000 wafers per month and is currently shared between AMD's CPU and GPU lines, Apple, Nvidia, and other companies.
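For readers who want to check the math, the per-wafer figure can be reproduced with the standard gross-dies-per-wafer approximation. The sketch below is simplified: it ignores defect yield, scribe lines, and reticle layout, so treat the numbers as rough bounds rather than fab data.

```python
import math

def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
    """Classic approximation: usable wafer area divided by die area,
    minus an edge-loss term for partial dies along the circumference."""
    radius = wafer_diameter_mm / 2
    wafer_area = math.pi * radius ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)

dies = gross_dies_per_wafer(814)       # GH100 die size: 814 mm^2
wafers = math.ceil(2_000_000 / dies)   # assumes every candidate die is good
print(dies, wafers)                    # -> 63 31747
```

Using the article's more generous figure of 65 candidate dies per wafer instead gives about 30,800 wafers, in line with the "nearly 31,000" cited above.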
Secondly, GH100 relies on HBM2E or HBM3 memory and uses TSMC's CoWoS packaging, so Nvidia needs to secure supply on this front as well, and TSMC is currently struggling to meet demand for CoWoS.
Thirdly, because H100-based devices use HBM2E, HBM3, or HBM3E memory, Nvidia will have to secure enough HBM memory packages from the likes of Micron, Samsung, and SK Hynix.
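The scale of that HBM procurement is easy to sketch. The stacks-per-GPU figure below reflects the known H100 package layout (six HBM sites, five enabled on the 80 GB SKUs); the unit count simply takes the top of the FT-reported shipment range.

```python
# HBM stack demand sketch. Stacks-per-GPU reflects the H100 package
# layout (6 HBM sites, 5 active on 80 GB parts); the unit count is
# the upper end of the FT-reported 2024 shipment range.
UNITS = 2_000_000        # projected 2024 H100 shipments, high end
STACKS_PER_GPU = 5       # active HBM stacks on an 80 GB H100

total_stacks = UNITS * STACKS_PER_GPU
print(f"{total_stacks:,} HBM stacks needed")  # -> 10,000,000 HBM stacks needed
```

Ten million stacks in a year would be a substantial slice of the combined output of the three HBM makers, which helps explain why memory supply is a bottleneck of its own.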
Finally, Nvidia's H100 compute cards and SXM modules have to be installed somewhere, so Nvidia will need to ensure that its partners also at least triple the output of their AI servers, which is yet another concern.
But if Nvidia can supply all of the requisite H100 GPUs, it certainly stands to make a massive profit on the endeavor next year.