You’re mixing up HBM and SRAM - which is an understandable confusion.
NVIDIA's datacenter GPUs use HBM (High Bandwidth Memory), which is a form of DRAM: each bit is stored as charge on a capacitor that has to be read and periodically refreshed.
Most chips have on-die caches built out of SRAM, where each bit is held by a feedback loop of transistors (typically six per cell) rather than a capacitor.
The big differences are in access time, power and density: SRAM access is roughly 100 times faster than DRAM, but DRAM uses much less power per gigabyte and takes up far less silicon area per gigabyte of stored data.
Most processors have a few MB of SRAM as caches. Cerebras is kind of insane in that they’ve built one massive wafer-scale chip with a comparative ocean of SRAM (44GB).
In theory that gives them a big performance advantage over HBM-based chips.
As with any chip design though, it really isn’t that simple.
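To put "comparative ocean" in numbers, here is a rough back-of-the-envelope sketch. The 44 GB figure is the one quoted above; the 8 MB cache size is just an assumed stand-in for "a few MB", not a spec for any particular processor, and `capacity_ratio` is a made-up helper name.

```python
# Back-of-the-envelope capacity comparison (decimal units).
GB = 1000**3
MB = 1000**2

wafer_sram_bytes = 44 * GB     # on-chip SRAM figure quoted above for Cerebras
typical_cache_bytes = 8 * MB   # ASSUMPTION: illustrative value for "a few MB"


def capacity_ratio(big_bytes, small_bytes):
    """Return how many times larger big_bytes is than small_bytes."""
    return big_bytes / small_bytes


ratio = capacity_ratio(wafer_sram_bytes, typical_cache_bytes)
print(f"44 GB is about {ratio:,.0f}x an 8 MB cache")
```

Under that assumption the gap is a few thousand times; pick a different cache size and the ratio scales accordingly.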
Thanks, TIL.
So what you’re saying is that Cerebras chips offer 44GB of what is comparable to L1 cache, while NVIDIA is offering 80GB of what is comparable to “fast DRAM”?