Hacker News

tossandthrow, yesterday at 10:21 AM

We are getting closer!

What other optimizations are there beyond those that explicitly fall into the 4 categories the top commenter listed?


Replies

mirekrusin, yesterday at 1:32 PM

For inference, assorted categories may include vectorization, register allocation, scheduling, lock elision, better algorithms, changing complexity, better data structures, profile-guided specialization, layout/alignment changes, compression, quantization/mixed precision, fused kernels (which go beyond inlining), low-rank adapters, sparsity, speculative decoding, parallel/multi-token decoding, better sampling, prefill/decode separation, analog computation (why not), etc.
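
To make one item from that list concrete, here is a minimal sketch of quantization, assuming the simplest scheme (symmetric, per-tensor, int8). The function names `quantize_int8` and `dequantize` are made up for illustration and do not come from any particular library; real inference stacks typically use per-channel or per-group scales and calibrated ranges.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization (illustrative, hypothetical helper).

    Maps each float to an integer in [-127, 127] using one shared scale.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the integers and the shared scale."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each restored value is within half a quantization step of the original.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The trade is storage and memory bandwidth (one byte per weight instead of four) against a bounded rounding error, which is why it does not fit neatly under a "do less work" or "inline more" heading.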

There is more to it: the 4 categories mentioned are not the only ones, and they are not even broad categories.

If somebody likes broad categories, here is a good one: "1s and 0s". You can compute anything you want with them, so there you go: a single category for everything. Is it meaningful? Not really.
