HBM2 (High Bandwidth Memory) has been in-market for a few years now, but the various companies working on HBM3 have kept its specifications close to the chest. Technologies like HBM2E have extended baseline HBM2 performance while new capabilities, like Samsung’s Processor-in-Memory, have expanded what HBM2 is capable of. We now know a bit more about HBM3, courtesy of a new Rambus announcement regarding its own memory interface subsystem products.
According to Rambus, early HBM3 hardware should deliver roughly 1.4x the bandwidth of current HBM2E, or about 665GB/s per stack. As the standard matures, that figure will rise to ~1.075TB/s of memory bandwidth per stack, with maximum I/O transfer rates of up to 8.4Gbps per pin. These figures are per stack, and many GPUs use two to four stacks of HBM, so a four-stack HBM3 solution at the early 665GB/s rate would provide ~2.7TB/s of total bandwidth, far more than anything currently available.
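As a rough check on those figures, peak bandwidth per stack is the per-pin transfer rate multiplied by the interface width, divided by eight bits per byte. The Python sketch below assumes HBM's 1024-bit per-stack interface; the function names are illustrative and are not part of any Rambus tooling.

```python
# Back-of-the-envelope HBM bandwidth arithmetic.
# Assumption: each stack exposes a 1024-bit interface.
BUS_WIDTH_BITS = 1024


def stack_bandwidth_gbs(pin_rate_gbps, bus_width_bits=BUS_WIDTH_BITS):
    """Peak bandwidth of one stack in GB/s, given the per-pin rate in Gbps."""
    return pin_rate_gbps * bus_width_bits / 8  # 8 bits per byte


def implied_pin_rate_gbps(bandwidth_gbs, bus_width_bits=BUS_WIDTH_BITS):
    """Per-pin rate in Gbps implied by a quoted per-stack bandwidth in GB/s."""
    return bandwidth_gbs * 8 / bus_width_bits


# Peak HBM3 figure: 8.4Gbps per pin across 1024 pins.
peak_gbs = stack_bandwidth_gbs(8.4)

# Early HBM3 figure: work backwards from the quoted 665GB/s per stack.
early_pin_rate = implied_pin_rate_gbps(665)

# Four early-HBM3 stacks, converted from GB/s to TB/s.
four_stack_tbs = 4 * 665 / 1000

print(f"Peak per-stack bandwidth: {peak_gbs:.1f} GB/s")
print(f"Implied early pin rate:   {early_pin_rate:.2f} Gbps")
print(f"Four early stacks:        {four_stack_tbs:.2f} TB/s")
```

The 8.4Gbps rate works out to about 1,075GB/s, which matches the ~1.075TB/s figure, and four stacks at 665GB/s come to 2.66TB/s, which rounds to the ~2.7TB/s quoted above.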
HBM occupies an odd position in the GPU memory hierarchy. We expect to see more high-end CPU and GPU products carry the memory standard in the future, but AMD and Nvidia have both moved to GDDR6/GDDR6X rather than HBM, possibly to reduce costs. HBM and HBM2 have power advantages over conventional VRAM, but HBM sits on-package and connects to the processor over 1024-bit links through a 2.5D interposer layer. This has always saddled the standard with higher costs that somewhat offset its significant bandwidth and performance advantages.
HBM3 might be more acceptable as a GPU solution now that bandwidth is above 1TB/s per stack and density has hit a reported 16GB. HBM3 will support up to 64GB RAM stacks in a 16-Hi configuration at 4GB per layer. At that kind of density and bandwidth, a single HBM3 stack would provide roughly 1.15x the bandwidth of the current RTX 3090. HBM3 isn’t expected in-market before the end of 2022 or early 2023, so we’d expect the highest-end GPUs to leapfrog this metric by the time the standard is available. Even so, HBM is typically deployed in up to four stacks, and a high-end GPU still wouldn’t require more than two to be competitive.
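The capacity and comparison figures can be reproduced with a few lines of arithmetic. This is a minimal sketch: the 936GB/s value for the RTX 3090 is Nvidia's published memory bandwidth and does not come from the Rambus announcement, and the variable names are purely illustrative.

```python
# Capacity of a maximum-height HBM3 stack, using the figures quoted above.
LAYERS_PER_STACK = 16   # 16-Hi configuration
GB_PER_LAYER = 4        # 4GB per layer
stack_capacity_gb = LAYERS_PER_STACK * GB_PER_LAYER

# Bandwidth comparison in GB/s.
HBM3_PEAK_STACK_GBS = 1075   # ~1.075TB/s per stack at 8.4Gbps
RTX_3090_GBS = 936           # assumption: Nvidia's published spec, not from Rambus

one_stack_ratio = HBM3_PEAK_STACK_GBS / RTX_3090_GBS
two_stack_gbs = 2 * HBM3_PEAK_STACK_GBS

print(f"Max stack capacity:      {stack_capacity_gb} GB")
print(f"One stack vs. RTX 3090:  {one_stack_ratio:.2f}x")
print(f"Two stacks:              {two_stack_gbs} GB/s")
```

A single stack lands at roughly 1.15x the RTX 3090's bandwidth, and two stacks would deliver a little over twice that card's figure.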
Our assumption is that GDDR will continue to dominate GPU memory (AMD’s experimentation with HBM looks more like a swerve than a VRAM standard switch at this point), but HBM3’s new capabilities are broad enough that we might see the technology used more widely in consumer products in the future.
One possibility is that we might one day see HBM come to desktop CPUs rather than GPUs. This year, AMD is adding huge L3 caches to its processors. Next year, both AMD and Intel will begin shipping HBM-equipped server processors in the form of Genoa (AMD) and Sapphire Rapids (Intel). Technologies that debut in the server space often make their way to desktops over time. I wouldn’t put a timeline on the idea, but it’s not impossible.
Feature image shows Nvidia’s Volta, a GPU built with HBM2.