
lkeskull today at 7:41 AM

this example worked in 2021; it's 2026. wake up. these models are not just "finding the most likely next word based on what they've seen on the internet".


Replies

strix_varius today at 7:49 AM

Well, yes, definitionally they are doing exactly that.

It just turns out that there's quite a bit of knowledge and understanding baked into the relationships of words to one another.
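To make that concrete, here is roughly what the inference loop looks like. This is only a minimal sketch; the model ("gpt2" as a small stand-in), the prompt, and the greedy argmax are illustrative choices, not how any particular production system decodes:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Minimal sketch: any small causal LM will do; "gpt2" is just a stand-in.
    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tok("The capital of France is", return_tensors="pt").input_ids
    for _ in range(10):
        logits = model(ids).logits        # a score for every vocabulary token
        next_id = logits[0, -1].argmax()  # pick the single most likely next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)
    print(tok.decode(ids[0]))

Everything the model "knows" has to surface through that one operation: score every possible next token given the text so far.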

LLMs are heavily influenced by preceding words. It's very hard for them to backtrack on an earlier branch. This is why all the reasoning models use "stop phrases" like "wait", "however", or "hold on...". It's literally just text injected to make the autocomplete more likely to revise previous bad branches.
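And that "wait" trick is just more of the same loop with extra text spliced into the context. Again a sketch with illustrative choices (the cue wording, the toy arithmetic prompt, gpt2 as the model), not anyone's actual decoding recipe:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Illustration only: the cue text and model are arbitrary choices for the example.
    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    draft = "Q: What is 17 * 24? A: 398."       # a (wrong) draft answer already in context
    cue = " Wait, let me double-check that."    # the "stop phrase" is appended as plain text
    ids = tok(draft + cue, return_tensors="pt").input_ids
    for _ in range(30):
        logits = model(ids).logits
        next_id = logits[0, -1].argmax()        # same next-token loop as before
        ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)
    print(tok.decode(ids[0]))  # the continuation is now conditioned on the revision cue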

jaccola today at 8:11 AM

The person above was being a bit pedantic, and zealous in their anti-anthropomorphism.

But they are literally predicting the next token. They do nothing else.

Also, if you think they were just predicting the next token in 2021, note that there has been no fundamental architecture change since then. All gains have come from scale and efficiency optimisations (not to discount that; there's an awful lot of complexity in both).

csomar today at 8:28 AM

Unless LLM architecture has changed, that is exactly what they are doing. You might need to learn more about how LLMs work.
