Hacker News

redox99, last Tuesday at 12:54 AM, 1 reply

And I expect Blackwells to hold their value even better (they are already heavily optimized for LLMs, and semiconductor process improvements will slow down).


Replies

joefourier, last Tuesday at 11:45 AM

Yeah, most of the performance increases have come from architectural improvements like reduced-precision tensor cores. AFAIK FP4 is basically the limit for floating-point matmuls; to go below that you need to switch to integer addition, and I don’t think we’ve figured out 1-bit LLMs just yet.
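
To make the "FP4 is the limit" point a bit more concrete, here is a rough Python sketch. It assumes the E2M1 layout for FP4 (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1), which is one common 4-bit float layout and not something specified above. It also uses ternary weights in {-1, 0, +1} as a stand-in for the "integer addition" regime. The function names are made up for illustration.

```python
def fp4_e2m1_values():
    """Decode all 16 bit patterns of an assumed E2M1 4-bit float (bias 1)."""
    values = []
    for bits in range(16):
        sign = -1.0 if bits & 0b1000 else 1.0
        exp = (bits >> 1) & 0b11   # 2 exponent bits
        man = bits & 0b1           # 1 mantissa bit
        if exp == 0:
            # Subnormal: no implicit leading 1, so only 0 and 0.5
            mag = man * 0.5
        else:
            # Normal: implicit leading 1, exponent bias of 1
            mag = (1 + man * 0.5) * 2.0 ** (exp - 1)
        values.append(sign * mag)
    return values


def ternary_matvec(weights, x):
    """Matrix-vector product where every weight is -1, 0, or +1.

    No multiplications are needed: each weight either adds the input,
    subtracts it, or skips it.
    """
    out = []
    for row in weights:
        acc = 0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi
            elif w == -1:
                acc -= xi
        out.append(acc)
    return out


if __name__ == "__main__":
    # Only 8 distinct magnitudes are representable
    print(sorted(set(abs(v) for v in fp4_e2m1_values())))
    print(ternary_matvec([[1, 0, -1], [-1, -1, 1]], [2, 5, 3]))
```

Under that assumed layout there are only eight distinct magnitudes, so dropping another bit leaves very little of a floating-point format. The ternary loop shows why the multiply disappears at that point.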