
wmf · today at 8:31 AM

Arguably, DRAM-based GPUs/TPUs are quite inefficient for inference compared to SRAM-based Groq/Cerebras. GPUs are highly optimized, but they still lose to architectures better suited to inference, where decoding is typically bound by memory bandwidth rather than compute.
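
A rough back-of-envelope sketch of that bandwidth argument, assuming batch-1 decode where every weight is streamed once per generated token; the model size and bandwidth figures are illustrative assumptions (order-of-magnitude, not measurements), and max_tokens_per_sec is just a helper name for this sketch:

    # Roofline-style ceiling on decode throughput when generation is
    # memory-bandwidth bound: each token requires reading all weights once.
    # All numbers are illustrative assumptions, not measured figures.

    def max_tokens_per_sec(memory_bw_bytes_per_s: float, model_bytes: float) -> float:
        """Upper bound on tokens/s for batch-1, bandwidth-bound decode."""
        return memory_bw_bytes_per_s / model_bytes

    MODEL_BYTES = 70e9   # ~70B-parameter model at 8-bit weights (assumed)

    dram_bw = 3e12       # ~3 TB/s, HBM/DRAM-class GPU bandwidth (assumed)
    sram_bw = 80e12      # ~80 TB/s, on-chip SRAM-class bandwidth (assumed)

    print(f"DRAM-bound ceiling: {max_tokens_per_sec(dram_bw, MODEL_BYTES):.0f} tokens/s")
    print(f"SRAM-bound ceiling: {max_tokens_per_sec(sram_bw, MODEL_BYTES):.0f} tokens/s")

Under these assumed numbers the SRAM-based part has a per-stream ceiling more than an order of magnitude higher, which is the core of the efficiency claim; batching, quantization, and KV-cache traffic shift the exact figures but not the bandwidth-bound shape of the argument.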