What is the right understanding of how LLMs work and what is the correct diagnosis?

maxbond • yesterday at 9:11 PM

Replies

As I said, I believe statistical physics is a very good guide to intuition here. Molecules move randomly, but that does not mean a cup of water will spontaneously boil. Sometimes the probability of an event is so low that, even though it is not mathematically zero, it does not matter: you will never observe it in the known universe.

An LLM generating each token probabilistically does not mean there's a realistic chance of it generating arbitrary output, where we can define "realistic" as "If we transformed the whole known universe into data centers and ran this model until the heat death of the universe, we would encounter it at least once."
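To put rough numbers on that, here is a back-of-the-envelope sketch. The probability of a whole sequence is the product of its per-token conditional probabilities, so it shrinks exponentially with length. Every figure below (the per-token probability, the sequence length, the sampling budget) is a made-up assumption for illustration, not a measurement of any real model, and the function names are hypothetical.

```python
import math


def log10_sequence_probability(per_token_prob: float, num_tokens: int) -> float:
    """log10 of the chance that num_tokens consecutive draws each pick a
    token that has probability per_token_prob (assumed constant here)."""
    return num_tokens * math.log10(per_token_prob)


def log10_expected_occurrences(log10_prob: float, log10_samples: float) -> float:
    """log10 of the expected number of times the sequence shows up
    when we draw 10**log10_samples independent samples."""
    return log10_prob + log10_samples


# Illustrative assumptions only:
PER_TOKEN_PROB = 1e-4   # each "wrong" token is a 1-in-10,000 pick
NUM_TOKENS = 50         # the unwanted output is 50 such tokens long
LOG10_SAMPLES = 150.0   # a deliberately absurd budget of 10**150 samples

log10_p = log10_sequence_probability(PER_TOKEN_PROB, NUM_TOKENS)
log10_expected = log10_expected_occurrences(log10_p, LOG10_SAMPLES)

print(f"log10 P(sequence)       = {log10_p:.1f}")
print(f"log10 E[# occurrences]  = {log10_expected:.1f}")
print("realistic" if log10_expected >= 0 else "never observed in practice")
```

Working in log space avoids floating-point underflow. Under these assumptions the sequence probability is on the order of 10^-200, so even 10^150 samples leave the expected count far below one. The same arithmetic cuts the other way: if a bad sequence is made of tokens the model rates as likely, it will show up quickly, which is a property of the trained model rather than of sampling as such.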

Of course, that does not mean LLMs are infallible. They fail all the time! But you can't explain those failures as a fundamental shortcoming of a probabilistic structure: that's not a logical argument.

Or, back to the original discussion: the fact that this one particular LLM generated a command to delete the database is not a fundamental shortcoming of the LLM architecture. It's just a shortcoming of the LLMs we currently have.


alt Hacker News
