Hacker News

medi8r · today at 2:02 AM

A transformer tokenizes its input, then does a bunch of matmuls and ReLUs set up in a certain way. It never gets to see the raw number (just as you don't: when you look at 1+1, the input has to pass through your visual cortex etc. first).
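A toy sketch of that point, assuming a hypothetical chunking rule (not any real tokenizer's vocabulary): a run of digits becomes several string tokens, so the model only ever sees pieces, never a numeric value.

```python
# Toy sketch (hypothetical, not a real tokenizer): digit runs are split
# into fixed-size chunks, so the model sees tokens, not a number.
def toy_tokenize(text: str, chunk: int = 3) -> list[str]:
    tokens, digits = [], ""
    for ch in text:
        if ch.isdigit():
            digits += ch
        else:
            # flush any pending digit run in fixed-size chunks
            tokens += [digits[i:i + chunk] for i in range(0, len(digits), chunk)]
            digits = ""
            tokens.append(ch)
    tokens += [digits[i:i + chunk] for i in range(0, len(digits), chunk)]
    return tokens

print(toy_tokenize("12345+678"))  # ['123', '45', '+', '678']
```

Note how "12345" lands in the model as two unrelated tokens, '123' and '45', with no built-in notion that together they denote twelve thousand three hundred forty-five.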


Replies

Lerc · today at 3:30 AM

Notably, the difference is that ten digits are not the same thing as a number. One might say that turning them into a number is the first step, but neural nets being what they are, they are liable to produce the correct result without bothering to form a representation any more pure than a list of digits.

I guess the analogy there is that a 74LS283 never really has a number either; it just manipulates a set of logic levels.
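A minimal sketch of that analogy: the 74LS283 is a 4-bit adder, and the simulation below (bit lists LSB-first, a hypothetical framing rather than a gate-accurate model of the chip) shows how the right sum falls out of combining logic levels with no "number" ever appearing inside.

```python
# Toy sketch of what a 4-bit adder like the 74LS283 does: combine
# logic levels bit by bit; no integer representation exists inside.
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    s = a ^ b ^ cin                   # sum bit
    cout = (a & b) | (cin & (a ^ b))  # carry out
    return s, cout

def add4(a_bits, b_bits, cin=0):
    """a_bits/b_bits are four logic levels each, LSB first."""
    out, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry

# 6 (LSB-first 0,1,1,0) + 7 (1,1,1,0) -> 13 (1,0,1,1), carry 0
print(add4([0, 1, 1, 0], [1, 1, 1, 0]))  # ([1, 0, 1, 1], 0)
```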

Filligree · today at 3:34 AM

So the question is: why do we tokenise it in a way that makes everything harder?
