ColonelPhantom • today at 12:46 AM

Interesting read! One remark though: I'm not too familiar with the architecture of a Google TPU, but comparing the TPU's VMEM with Nvidia's shared memory feels wrong to me.

Given VMEM's size and its shared nature, it feels far more natural to compare it with the GPU's L2 cache, which is also shared across the entire GPU and is of the same order of magnitude in size (40 MB on the listed A100).