
jacquesm | today at 3:25 PM | 2 replies

LLMs are text models, not world models, and that is the root cause of the problem. If you and I were discussing furniture and for some reason you had assumed the furniture was glued to the ceiling instead of standing on the floor (contrived example), it would most likely take only one correction, grounded in your actual experience, for you to accept that you were probably on the wrong track. An LLM will happily re-introduce that error a few ping-pongs later and re-establish the track it was on before, because that track apparently acts as some kind of attractor.

Not having a world model is a massive disadvantage when dealing with facts. Facts are supposed to reinforce each other; if you let even a single nonsense fact through, you can very confidently drift into what at best is misguided science fiction, and at worst becomes the basis for an edifice that simply has no support.

Facts are contagious: they work just like foundation stones. If you allow incorrect facts to become part of your foundation, you will produce nonsense. This is my main gripe with AI and, funnily enough, also my main gripe with some mass human activities.


Replies

coldtea | today at 3:30 PM

>LLMs are text models, not world models, and that is the root cause of the problem.

Is it, though? In the end, the information in the training texts is a distilled proxy for the world, and the weighted model ends up being a world model, just a once-removed one.

Text is not that different from visual information in that regard (and humans base their world model on both).

>Not having a world model is a massive disadvantage when dealing with facts. Facts are supposed to reinforce each other; if you let even a single nonsense fact through, you can very confidently drift into what at best is misguided science fiction, and at worst becomes the basis for an edifice that simply has no support.

Regular humans believe all kinds of facts that are nonsense, many more that are simply wrong, and quite a few that run counter to logic.

And short of omnipresence and omniscience, i.e. directly examining the whole world, any world model (human or AI) is built on sets of facts, many of which might not be true or valid to begin with.

show 1 reply
lubujackson | today at 5:51 PM

The "world model" is what we often refer to as the "context". But it is hard to anticipate bad assumptions that seem obvious because of our existing world model. One of the first bugs I scanned past from LLM generated code was something like:

    if user.id == "id": ...

I was not anticipating that it would arbitrarily put quotes around a variable name. Other times it will do all kinds of smart logic, generate data with ids, and then fail to use those ids for lookups, or something equally obvious.
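
A minimal sketch of that class of bug (the function, field, and variable names here are made up, just to show the pattern, not the actual code):

    # What the LLM produced: compares the id field against the literal
    # string "id" instead of the variable that holds the id we want.
    def find_users_buggy(users, target_id):
        return [u for u in users if u["id"] == "id"]  # matches nothing useful

    # What was presumably intended: compare against the variable.
    def find_users_fixed(users, target_id):
        return [u for u in users if u["id"] == target_id]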

The problem is that LLMs guess so much correctly that it is near impossible to understand how or why they might go wrong. We can address this with heavy validation, iterative testing, etc. But the guardrails needed to actually make the results bulletproof go far beyond normal testing. LLMs can make such fundamental mistakes while easily completing complex tasks that we need to reset our expectations for what "idiot-proofing" really looks like.
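
As a rough sketch of what one such guardrail can look like (the record shape and field names are assumptions for illustration): a check that every id an LLM-generated record references was actually generated, before any lookups run:

    def validate_references(records, known_ids):
        # Reject LLM output whose "user_id" fields point at ids that were never generated.
        bad = [r for r in records if r.get("user_id") not in known_ids]
        if bad:
            raise ValueError(f"{len(bad)} record(s) reference unknown ids, e.g. {bad[:3]}")
        return records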

show 1 reply