This is a fun idea. What surprised me is the inversion where MUL ends up faster than ADD because the neural LUT removes sequential dependency while the adder still needs prefix stages.
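To make that dependency concrete, here's a toy sketch (my own illustration, not code from the linked work): a ripple-style adder has to thread a carry bit through every position sequentially, while a precomputed lookup table answers a multiply in a single indexing step.

```python
# Illustrative sketch: sequential carry dependency in addition
# vs. a single lookup for multiplication.

def ripple_add(a: int, b: int, width: int = 8) -> int:
    """Add bit by bit; each step depends on the carry from the previous one."""
    carry, result = 0, 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        s = x ^ y ^ carry                      # sum bit needs carry from bit i-1
        carry = (x & y) | (carry & (x ^ y))    # carry out feeds the next iteration
        result |= s << i
    return result

# A multiply "LUT": every answer precomputed, so one lookup, no carry chain.
MUL_LUT = {(a, b): a * b for a in range(16) for b in range(16)}

print(ripple_add(13, 7))    # sequential: 8 dependent steps
print(MUL_LUT[(13, 7)])     # parallel-friendly: one table access
```

A real prefix adder cuts the carry chain to O(log n) stages rather than O(n), but the point stands: it still has sequential stages, whereas a (neural) table lookup has none.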
Out of curiosity, how much slower is this than an actual CPU?
"Multiplication is 12x faster than addition..."
Wow. That's cool, but what happens to the regular CPU?
Well, GPUs are just special-purpose CPUs.
Being able to perform precise math in an LLM is important; glad to see this.
As foretold six years ago. [1]
[1]: https://breandan.net/2020/06/30/graph-computation#roadmap