
dagss · yesterday at 9:08 PM

Isn't talking about "here's how LLMs actually work" in this context a bit like saying "a human can't be relevant to X, because a brain is only a set of molecules, neurons, and synapses"?

Or even "this book won't have any effect on the world because it's only a collection of letters; see here, black ink on paper, that is what it IS, it can't DO anything"...

Saying an LLM is a statistical prediction engine for the next token is, IMO, sort of confusing what it is with the medium it is expressed in and built of.

For instance, take those small experiments, mentioned in a sibling post, that train a network on addition problems. The weights end up forming an addition machine. An addition machine is what it is; that is the emergent behavior. The machine-learning weights are just the medium it is expressed in.
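(A hypothetical, toy-scale version of such an experiment can be sketched in plain Python. Here a single linear unit is trained by gradient descent on random addition problems; its weights converge toward (1, 1), so the trained weights literally become an adder. Real experiments of this kind use small transformers, but the point about medium vs. emergent machine is the same.)

```python
import random

# Toy "addition machine": a single linear unit y = w1*a + w2*b + bias,
# trained by stochastic gradient descent on random addition problems.
random.seed(0)
w1, w2, bias = random.random(), random.random(), 0.0
lr = 0.01

for _ in range(5000):
    a, b = random.uniform(-1, 1), random.uniform(-1, 1)
    pred = w1 * a + w2 * b + bias
    err = pred - (a + b)      # gradient of squared error w.r.t. pred
    w1 -= lr * err * a
    w2 -= lr * err * b
    bias -= lr * err

# After training, (w1, w2, bias) is close to (1.0, 1.0, 0.0):
# the weights ARE the addition machine.
print(round(w1, 2), round(w2, 2), round(bias, 2))
```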

What's interesting about LLMs is such emergent behavior. Yes, it's statistical prediction of likely next tokens, but training weights for that may well have the side effect of wiring up some kind of "intelligence" (for reasonable everyday definitions of the word, such as programming as well as a median programmer). We don't really know this yet.


Replies

ActorNightly · yesterday at 10:07 PM

It's pretty clear that the problem of solving AI is a software problem; I don't think anyone would disagree.

But that problem is MUCH MUCH MUCH harder than people make it out to be.

For example, you can reliably train an LLM to produce accurate output for assembly code that fits into its context window. But give it a terabyte of assembly code and it won't be able to produce correct output, because it will run out of context.

You can get around that with agentic frameworks, but all of those are currently manually coded.
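(The distinction can be sketched with a stand-in model. Everything here is hypothetical: `model` is a placeholder for an LLM call, and `CONTEXT_WINDOW` is a toy size. A naive call silently drops everything past the window; the "agentic" workaround is just a hand-written loop that feeds chunks through and collects results.)

```python
CONTEXT_WINDOW = 8  # toy number of tokens the model can "see" at once

def model(tokens):
    """Placeholder for an LLM call: reports how many tokens it saw."""
    return f"summary({len(tokens)} tokens)"

def naive(tokens):
    # Anything past the window is silently truncated -> wrong output
    # for long inputs.
    return model(tokens[:CONTEXT_WINDOW])

def agentic(tokens):
    # Manually coded loop: process chunk by chunk. This orchestration
    # lives outside the model; nothing was learned end-to-end.
    return [model(tokens[i:i + CONTEXT_WINDOW])
            for i in range(0, len(tokens), CONTEXT_WINDOW)]

long_input = list(range(30))
print(naive(long_input))          # sees only 8 of the 30 tokens
print(len(agentic(long_input)))   # 4 chunks cover all of them
```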

So how do you train an LLM to take assembly code of any length and produce the correct result? The only way is essentially to train the structure of the neurons inside it to behave like a computer. But the problem is that you can't do back-propagation through discrete 0 and 1 values, unless you explicitly code a CPU architecture into the network. So error correction over inputs and outputs is evidently not the way we get to intelligence.
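(Why discrete values block back-propagation can be shown concretely: a hard 0/1 threshold has zero derivative everywhere away from the step, so gradient descent receives no learning signal through it, whereas a smooth surrogate like a sigmoid does pass gradient. A minimal numerical-gradient sketch:)

```python
import math

def step(x):
    """Hard 0/1 threshold, like a discrete CPU-style gate."""
    return 1.0 if x > 0.0 else 0.0

def sigmoid(x):
    """Smooth surrogate that does carry gradient."""
    return 1.0 / (1.0 + math.exp(-x))

def numerical_grad(f, x, h=1e-6):
    """Central-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Away from the threshold, nudging the input never changes the output,
# so the gradient is exactly zero and back-propagation learns nothing:
print(numerical_grad(step, 0.5))     # 0.0
print(numerical_grad(step, -2.0))    # 0.0
# The sigmoid, by contrast, has a nonzero gradient to propagate:
print(numerical_grad(sigmoid, 0.5))  # ~0.235
```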

It may be that the answer is pretty much a stochastic search, where you spin up x instances of trillion-parameter nets and make them operate in environments under some form of genetic algorithm until you get something that behaves like a human, and any shortcut to this is not really possible because of essentially chaotic effects.
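(The kind of search described is a genetic algorithm in the classic sense, which can be sketched at toy scale. Everything below is a stand-in: `TARGET` plays the role of the desired behavior, the "instances" are 3-parameter vectors instead of trillion-parameter nets, and fitness is just distance to the target.)

```python
import random

random.seed(1)

# Hypothetical "desired behavior": a parameter vector the search
# must discover without any gradient information.
TARGET = [0.5, -1.2, 3.0]

def fitness(params):
    # Higher is better: negative squared distance to the target.
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))

def mutate(params, scale=0.1):
    return [p + random.gauss(0, scale) for p in params]

# "Spin up x instances": a population of random candidates.
pop = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(50)]

for generation in range(200):
    pop.sort(key=fitness, reverse=True)
    survivors = pop[:10]  # selection: keep the fittest unchanged
    pop = survivors + [mutate(random.choice(survivors))
                       for _ in range(40)]  # mutation fills the rest

best = max(pop, key=fitness)  # converges near TARGET
```

Note the trade-off this illustrates: the search needs only a black-box fitness score, no back-propagation, but it pays for that with many evaluations per step of progress.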


wavemode · yesterday at 10:34 PM

You're putting a bunch of words in the parent commenter's mouth, and arguing against a strawman.

In this context, "here’s how LLMs actually work" is what allows someone to have an informed opinion on whether a singularity is coming or not. If you don't understand how they work, then any company trying to sell their AI, or any random person on the Internet, can easily convince you that a singularity is coming without any evidence.

This is separate from directly answering the question "is a singularity coming?"
