PCIe 3.0 is the nice easy convenient generation where 1 lane = 1GBps. Given the overhead, thats pre...

jauntywundrkind • yesterday at 10:39 PM • 1 reply • view on HN

PCIe 3.0 is the nice easy convenient generation where 1 lane = 1GBps. Given the overhead, thats pretty close to 10Gb ethernet speeds (low latency though).

I do wonder how long the cards are going to need host systems at all. We've already seen GPUs with m.2 ssd attached! Radeon Pro SSG hails back from 2016! You still need a way to get the model on that in the first place to get work in and out, but a 1Gbe and small RISC-V chip (which Nvidia already uses formanagement cores) could suffice. Maybe even an rpi on the card. https://www.techpowerup.com/224434/amd-announces-the-radeon-...

Given the gobs of memory cards have, they probably don't even need storage; they just need big pipes. Intel had 100Gbe on their Xeon & Xeon Phi cores (10x what we saw here!) in 2016! GPUs that just plug into the switch and talk across 400Gbe or UltraEthernet or switched CXL, that run semi independently, feel so sensible, so not outlandish. https://www.servethehome.com/next-generation-interconnect-in...

It's far off for now, but flash makers are also looking at radically many channel flash, which can provide absurdly high GB/s, High Bandwidth Flash. And potentially integrated some extremely parallel tensorcores on each channel. Switching from DRAM to flash for AI processing could be a colossal win for fitting large models cost effectively (& perhaps power efficiently) while still having ridiculous gobs of bandwidth. With that possible win of doing processing & filtering extremely near to the data too. https://www.tomshardware.com/tech-industry/sandisk-and-sk-hy...

alt Hacker News

Replies