logoalt Hacker News

moezdyesterday at 10:10 PM5 repliesview on HN

LLMs are still next token predictors, just because you can give it more vague instructions and it still finds the right steps to follow, it doesn't mean it's intelligent. It means you're speaking the same language as the harness they trained your model on.

And that has a limit. If you are stuck at PoC level or simple apps, you have no idea how limited the current models still are. There you really need to break tasks down, not just trust a token predictor to list steps that sound good. There has to be a human in the loop somewhere, because by the time you start skipping permissions, best case you get the jackpot, more likely is you get a suboptimal solution and token waste and what's genuinely still terrifying when the model ignores instructions and does some stupid nonsense, ruining your day. It really is as sharp as a CNC machine. It's not not useful, but could be dangerous, so maybe don't try to carve wood with a monster machine, or park your Ferrari in that crammed neighbourhood if you don't know how to parallel park.


Replies

ACCount37yesterday at 10:54 PM

"Next token prediction" is an interface, not an algorithm. A process that "predicts next tokens" can be arbitrarily complex or simple, and arbitrarily capable or incapable of performing a given task.

Saying that an LLM can or can't do something because it's a "token predictor" is a category error. The interface isn't a hard limit.

joenot443today at 2:43 AM

What would you say is your benchmark for calling something intelligent?

show 1 reply
infinite_spinyesterday at 10:42 PM

> it doesn't mean it's intelligent

I'm not sure how you're defining "intelligent", but I'd like to know how it is able to exclude a language model, while still including humans, without simply defining it with an axiom that predefines LLMs as lacking intelligence.

show 2 replies
semiquaveryesterday at 10:35 PM

Yeah, and you’re just a next-word-sayer.

show 2 replies
MoltenMantoday at 12:17 AM

Calling LLMs 'next token predictors' is completely reductive and disingenuous; it's true that technically that is what they're doing, but so are you! What people generally mean by this though is that they're just 'predicting the next token of their training [i.e. the internet]'. If you were talking about the raw models, this would actually be true; but the models are post trained, so even this description isn't true at all anymore! Saying they aren't 'intelligent' is both not useful and (imo) wrong. Who cares if it matches your definition of 'intelligent'; it still gets impressive stuff done, much more impressive stuff than you seem to be implying.