dkdcio (yesterday at 12:11 AM)

…except we know what every neuron in a neural network is doing. I ask again, what criteria do we need to meet for you to claim we know how LLMs work?

we know the equations, we know the numbers going through the network, we know the universal approximation theorem. what are you looking for, exactly?
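
for reference, the classical single-hidden-layer statement (Cybenko 1989), roughly: for any continuous f on a compact K in R^n, any sigmoidal σ, and any ε > 0,

    \exists N,\ \{a_i, w_i, b_i\}_{i=1}^{N} \ \text{such that} \quad
    \sup_{x \in K} \left|\, f(x) - \sum_{i=1}^{N} a_i \,\sigma(w_i^\top x + b_i) \right| < \varepsilon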

I’ve answered the “what have they learnt” bit: a function that predicts the next token based on its training data. what more do you need?
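
to make “we know the numbers” concrete, a minimal sketch (toy shapes, random weights, a mean-pooled context standing in for attention; nothing like a real architecture). every value is plain arithmetic you can print:

    import numpy as np

    rng = np.random.default_rng(0)
    vocab, d = 50, 16
    E = rng.normal(size=(vocab, d))    # token embedding table (learned weights)
    W = rng.normal(size=(d, vocab))    # output projection (learned weights)

    def next_token_probs(token_ids):
        h = E[token_ids].mean(axis=0)  # crude context summary (stand-in for attention)
        logits = h @ W                 # a known equation: one matrix multiply
        p = np.exp(logits - logits.max())
        return p / p.sum()             # softmax, another known equation

    p = next_token_probs([3, 17, 42])
    print(p.argmax(), p.max())         # the predicted next token and its probability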


Replies

FeepingCreature (yesterday at 12:24 AM)

Yes, and in the analogy that's equivalent to saying you know "what" every instruction in the compression program is doing: push decrements rsp; xor rax, rax zeroes out the register. You know every step. But you don't know the algorithm those instructions are implementing, and that's the same situation we're in with LLMs. We can describe their actions numerically, but we cannot describe them behaviorally, and they're doing things we don't know how to do with any other numerical method. They've clearly learnt algorithms, but we cannot yet formalize what those algorithms are. The universal approximation theorem actually works against your argument here, because it's too powerful: they could be implementing anything.
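
Here's a toy version of that point (gcd picked arbitrarily): every line is a transparent primitive operation, but nothing in the code labels the algorithm.

    def f(a, b):
        while b:
            a, b = b, a % b   # each step is fully known: one swap, one modulo
        return a

    print(f(1071, 462))  # prints 21; the steps alone don't announce "Euclid's gcd"
                         # unless you already recognize the pattern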

edit: We know the data their function outputs; it's a "blurry jpeg of the internet" because that's what they're trained on. But we do not know what the function itself is, and being able to blurrily compress the internet into a TB or whatever is utterly beyond any other compression algorithm known to man.
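
That framing is quantifiable, incidentally: by standard source-coding results, a model's average log-loss on text equals the bits per token an arithmetic coder driven by that model would spend. A sketch with made-up probabilities:

    import math

    def total_bits(token_probs):
        # token_probs: the model's probability for each token actually observed;
        # an ideal coder spends -log2(p) bits on an event of probability p
        return sum(-math.log2(p) for p in token_probs)

    probs = [0.25, 0.5, 0.9, 0.05]   # hypothetical per-token model outputs
    print(total_bits(probs) / len(probs), "bits/token")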