No, I'm saying there are quite a few more I/O bottlenecks than that. Even in the more efficient training frameworks, there's per-op dispatch overhead in Python itself: all the boxing/unboxing of Python objects to C++ handles, dispatcher lookup + setup, all the autograd bookkeeping, etc.
All of these bottlenecks in sum are why you'd never get to 100% MFU (but I was conceding you probably don't need to in order to get value).
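To make the per-op overhead concrete, here's a minimal sketch (not from the original comment; shapes, op choice, and iteration counts are my own assumptions) that times many tiny eager-mode PyTorch ops against one large op doing the same total work. The gap between the two is mostly the Python-to-C++ boxing and dispatcher work being paid once per call rather than once per batch of work.

```python
# Minimal sketch: observe per-op dispatch overhead in eager PyTorch by
# comparing many tiny ops vs one big op over the same total elements.
# Sizes and iteration counts are illustrative assumptions.
import time
import torch

def time_it(fn, iters=100):
    # Warm up, then average wall-clock time per call.
    for _ in range(10):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters

small = [torch.randn(32, 32) for _ in range(1000)]
big = torch.randn(1000 * 32, 32)

# 1000 separate op calls: each pays Python -> C++ boxing, dispatcher
# lookup/setup, and (if requires_grad were set) autograd bookkeeping.
many_small = lambda: [t.mul(2.0) for t in small]

# One call over the same number of elements: the per-op overhead is
# amortized away, so the difference is mostly dispatch cost.
one_big = lambda: big.mul(2.0)

print(f"1000 small ops: {time_it(many_small) * 1e3:.2f} ms")
print(f"1 big op:       {time_it(one_big) * 1e3:.2f} ms")
```

On CPU with tensors this small, the many-small-ops path typically comes out far slower despite doing identical arithmetic, which is the per-op Python/dispatch tax the comment is pointing at.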