Just because it's not that hard to reach a high-level understanding of the transformer pipeline doesn't mean we understand how these systems function, or that they can't be developing some form of world model. Recently there has been more evidence that they are [1]. The feats of apparent intelligence LLMs sometimes display have taken even their creators by surprise. Sure, there's a lot of hype too; that's part and parcel of any new technology today. But we are far from understanding what makes them perform so well. In that sense, yeah, you could say they are a bit "magical".
[1] https://the-decoder.com/new-othello-experiment-supports-the-...
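For what it's worth, the evidence in [1] comes (as I understand it) from probing: train a small classifier on the model's internal activations and check whether it can read off the Othello board state, which the model only ever sees as a stream of move tokens. Roughly the idea, as a toy sketch with synthetic activations standing in for real Othello-GPT states (not the paper's actual code):

    # Toy sketch of linear probing: can a simple classifier recover a
    # "world state" variable from a model's hidden activations?
    # Synthetic data here; the real experiment probes Othello-GPT layers.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_samples, hidden_dim = 2000, 64

    # Pretend hidden states: noise plus a direction that encodes whether
    # a given board square is occupied (the hidden "world state").
    occupied = rng.integers(0, 2, size=n_samples)
    direction = rng.normal(size=hidden_dim)
    hidden_states = rng.normal(size=(n_samples, hidden_dim)) + np.outer(occupied, direction)

    X_tr, X_te, y_tr, y_te = train_test_split(hidden_states, occupied, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print("probe accuracy:", probe.score(X_te, y_te))
    # High accuracy = the state is linearly decodable from the activations;
    # chance-level accuracy = no such internal representation was found.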
> Just because it's not that hard to reach a high-level understanding of the transformer pipeline doesn't mean we understand how these systems function
Mumbo jumbo magical thinking.
They perform so well because they are trained on probabilistic token matching.
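To spell out what that training objective actually is: next-token prediction, i.e. cross-entropy between the model's predicted distribution and the token that actually comes next. A minimal sketch with toy tensors (the shapes are the point, not any real model):

    # The "guess the next word" loss that LLM pretraining minimizes.
    import torch
    import torch.nn.functional as F

    vocab_size, seq_len, batch = 100, 8, 2
    tokens = torch.randint(0, vocab_size, (batch, seq_len))  # training text as token ids
    logits = torch.randn(batch, seq_len, vocab_size)         # stand-in for model outputs

    # Predict token t+1 from everything up to t: shift the targets left by one.
    pred = logits[:, :-1, :].reshape(-1, vocab_size)
    target = tokens[:, 1:].reshape(-1)

    loss = F.cross_entropy(pred, target)
    print(loss.item())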
Where they perform terribly, e.g. math and reasoning, they delegate to other approaches, and that's how you get the illusion that there is actually something there. But there isn't. Faking intelligence is not intelligence. It's just text generation.
> In that sense, yeah you could say they are a bit "magical"
Nobody but the most unhinged hype pushers is calling it "magical". An LLM can never be AGI. Guessing the next word is not intelligence.
> they can't be developing some form of world model
Kind of impossible to form a world model when your foundation is probabilistic token guessing, which is what LLMs are. LLMs are a dead end on the road to "intelligence"; some genuinely novel approach would need to be discovered (or maybe never will be) to make progress in that direction. But hey, at least we can generate text fast now!