Hacker News

m3kw9 10/11/2024 | 3 replies

The LLM will know that 123 and 45 form one contiguous number, just as a human can tell that "123", a slight pause, then "45" is a single number.


Replies

TZubiri 10/11/2024

It's just so dissonant to me that the tokens in mathematics are the digits, and not bundles of digits. Tokenization makes sense as a way of taking the emphasis off individual letters; it provides language agnosticism.

But for maths, it doesn't seem appropriate.

I wonder what the effect of forcing tokenization of each separate digit would be.
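
For anyone who wants to picture what forcing per-digit tokens might look like, here is a minimal sketch. It is only a pre-tokenization pass, and `split_digits` is a hypothetical helper name, not part of any real tokenizer library. It splits every digit into its own piece and leaves non-digit runs intact for whatever tokenizer comes next.

```python
import re

def split_digits(text):
    # Hypothetical pre-tokenization step: each digit becomes its own piece,
    # while runs of non-digit characters are kept together so a downstream
    # tokenizer can handle them however it normally would.
    return re.findall(r"\d|\D+", text)

pieces = split_digits("123 and 45")
print(pieces)
```

With this scheme a number never lands in a multi-digit bundle, at the cost of longer sequences for long numbers.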

soulofmischief 10/11/2024

I think that as long as the attention mechanism has been trained on each possible numerical token enough, this is true. But if a particular token is underrepresented, it could potentially cause inaccuracies.

sva_ 10/11/2024

It won't 'see' [123, 45] though, but [7633, 2548], or rather sparse vectors that are zero everywhere except at the 7634th and 2549th positions, respectively.
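
A rough sketch of that point, assuming the IDs 7633 and 2548 are just illustrative and a made-up vocabulary size of 50,000 (neither is taken from a real tokenizer). Each token ID becomes a one-hot vector, and since indices are 0-based, ID 7633 is the 7634th position.

```python
VOCAB_SIZE = 50_000  # assumed vocabulary size, chosen only for illustration

def one_hot(token_id, vocab_size=VOCAB_SIZE):
    # Build a vector that is zero everywhere except at index `token_id`.
    vec = [0] * vocab_size
    vec[token_id] = 1
    return vec

token_ids = [7633, 2548]  # illustrative IDs standing in for "123" and "45"
vectors = [one_hot(t) for t in token_ids]

for t, v in zip(token_ids, vectors):
    # index(1) gives the 0-based slot; add 1 for the ordinal position
    print(t, "-> nonzero at position", v.index(1) + 1)
```

In practice the one-hot vector is usually never materialized; the ID is used directly as a row index into the embedding table, which is equivalent to multiplying the one-hot vector by that table.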