logoalt Hacker News

cschmidt06/26/20251 replyview on HN

Math operations go right to left in the text, while we write them left to right. So if you see the digits 123... in an autoreressive manner, you don't know really anything, since it could be 12345 or 1234567. If you flipped 12345 as 543..., you know the place value of each. You know that the 5 you encounter first is in the ones place, the 4 is the tens place, etc. It gives the LLM a better chance of learning arithmetic.


Replies

pas06/28/2025

ah, okay, thanks!

so basically reverse notation has the advantage of keeping magnitude of numbers (digits!) relative to each other constant (or at least anchored to the beginning of the number)

doesn't attention help with this? (or, it does help, but not much? or it falls out of autoregressive methods?)

show 1 reply