The bottleneck in training and inference isn’t matmul, and once a chip is more than a kindergarten toy you don’t go from FPGA to tape-out by clicking a button. For local memory he’s going to have to learn to either stack DRAM (not “3000 lines of Verilog”, and it requires a supply chain OpenAI just destroyed) or diffuse block RAM / SRAM on-die like Groq does, which is astronomically expensive bit for bit and torpedoes yields, compounding the issue. Then comes interconnect.
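To make the "it isn't matmul" point concrete, here's a back-of-the-envelope roofline sketch. The numbers are my own illustrative assumptions (a 70B-parameter model in FP16, roughly H100-class HBM bandwidth), not anything from the comment above — but the shape of the argument holds for any accelerator: during single-batch decoding every generated token has to stream all the weights out of memory, so memory bandwidth, not FLOPs, sets the ceiling.

```python
# Assumed, illustrative numbers:
params = 70e9          # hypothetical 70B-parameter model
bytes_per_param = 2    # FP16 weights
weight_bytes = params * bytes_per_param   # ~140 GB of weights read per token

hbm_bw = 3.35e12       # ~3.35 TB/s, roughly H100-class HBM3 bandwidth

# Bandwidth-bound throughput ceiling for batch-1 decode:
tokens_per_s_ceiling = hbm_bw / weight_bytes

print(f"~{tokens_per_s_ceiling:.0f} tokens/s max, no matter how fast the matmul units are")
```

Under these assumptions the ceiling is ~24 tokens/s, which is why local memory (stacked DRAM or on-die SRAM) dominates the design problem.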
The main point is that Nvidia’s monopoly will not last much longer.