djsjajah • today at 1:01 AM

That’s kind of a moot point. Even if none of those overheads existed, you would still be getting a fraction of the MFU. Models are fundamentally limited by memory bandwidth, even in the best-case scenarios of SFT or prefill.
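
A rough way to see the memory-bandwidth argument is a roofline estimate: attainable throughput is the smaller of peak compute and arithmetic intensity times memory bandwidth. The sketch below is illustrative only. The function names and the hardware numbers (1e15 FLOP/s peak, 3e12 bytes/s bandwidth) are assumptions, not figures from the thread.

```python
def matmul_intensity(tokens, d_in, d_out, bytes_per_elem=2):
    """FLOPs per byte for a (tokens x d_in) @ (d_in x d_out) matmul,
    assuming each operand and the output cross memory exactly once."""
    flops = 2 * tokens * d_in * d_out
    bytes_moved = bytes_per_elem * (d_in * d_out + tokens * d_in + tokens * d_out)
    return flops / bytes_moved


def mfu_ceiling(intensity, peak_flops, mem_bw):
    """Roofline upper bound on MFU for a kernel with the given intensity."""
    return min(peak_flops, intensity * mem_bw) / peak_flops


# Hypothetical accelerator, numbers chosen only for illustration.
PEAK_FLOPS = 1e15   # FLOP/s
MEM_BW = 3e12       # bytes/s

for tokens in (1, 64, 4096):
    ai = matmul_intensity(tokens, 4096, 4096)
    print(tokens, round(ai, 1), round(mfu_ceiling(ai, PEAK_FLOPS, MEM_BW), 3))
```

Under these assumptions the bound depends on how many tokens are processed per weight read. Low-intensity work such as elementwise ops, normalization, and small-batch matmuls sits far below peak regardless of any other overhead.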

And what are you doing that I/O is a bottleneck?