Hacker News

hyperhello, today at 1:57 AM

From context, then, I infer that a transformer is not composed of matrix multiplications, because otherwise it would simply be one that adds two 10-digit numbers.


Replies

medi8r, today at 2:02 AM

A transformer tokenizes its input, then does a bunch of matmuls and ReLUs set up in a certain way. It doesn't get to see the raw number (just as you don't: when you look at 1+1, you need your visual cortex etc. first).
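
To make that concrete, here is a rough toy sketch of the "tokenize, then matmul and ReLU" pipeline. Everything in it is an assumption for illustration: the character-level vocabulary, the names `tokenize` and `forward`, the width `D_MODEL`, and the random weights. Real tokenizers usually split numbers into multi-character chunks, and a real transformer also has attention, normalization, and many layers, none of which appear here.

```python
import random

# Assumed character-level vocabulary; real tokenizers usually differ.
VOCAB = list("0123456789+")
TOKEN_ID = {ch: i for i, ch in enumerate(VOCAB)}


def tokenize(text):
    """Map each character to an integer token id."""
    return [TOKEN_ID[ch] for ch in text]


def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def relu(m):
    """Elementwise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in m]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
D_MODEL = 4             # hypothetical hidden width
# Random, untrained weights: one embedding row per token, one square layer.
embedding = [[rng.uniform(-1, 1) for _ in range(D_MODEL)] for _ in VOCAB]
weights = [[rng.uniform(-1, 1) for _ in range(D_MODEL)] for _ in range(D_MODEL)]


def forward(text):
    """Token ids -> embedding lookup -> one linear layer -> ReLU."""
    ids = tokenize(text)
    x = [embedding[i] for i in ids]  # one vector per token
    return relu(matmul(x, weights))


prompt = "1234567890+9876543210"
ids = tokenize(prompt)
hidden = forward(prompt)
print(ids)          # a list of 21 small integers, one per character
print(len(hidden))  # one hidden vector per token
```

The point of the sketch is only that the layers operate on a sequence of per-token vectors. The two 10-digit integers never show up as numeric values anywhere in the computation.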
