Hacker News

root_axis (yesterday at 4:04 AM)

> Don’t let some factoid about how they are pretrained on autocomplete-like next token prediction fool you into thinking you understand what is going on in that trillion parameter neural network.

This is just an appeal to complexity, not a rebuttal to the critique of likening an LLM to a human brain.

> they are not “autocomplete on steroids” anymore either.

Yes, they are. The steroids are just more powerful. By refining training-data quality, increasing parameter counts, and lengthening context windows we can squeeze more utility out of LLMs than ever before, but ultimately Opus 4.5 is the same thing as GPT-2; the only difference is that coherence lasts a few pages rather than a few sentences.
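
To be concrete about what "autocomplete on steroids" means here: pretraining at any scale still optimizes the same thing, predicting the next token from the ones before it. A minimal sketch of that objective (the model and names are illustrative, not any lab's actual training code):

    import torch
    import torch.nn.functional as F

    # Toy sketch of the pretraining objective: minimize cross-entropy of each
    # next token given the preceding ones. `model` is a stand-in for any
    # autoregressive LM mapping token ids (B, T) -> logits (B, T, V).
    def next_token_loss(model, tokens: torch.Tensor) -> torch.Tensor:
        inputs, targets = tokens[:, :-1], tokens[:, 1:]   # shift by one position
        logits = model(inputs)                            # (B, T-1, V)
        return F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),          # (B*(T-1), V)
            targets.reshape(-1),                          # (B*(T-1),)
        )

Everything else (more parameters, better data, longer context) changes how well this objective gets optimized, not what it is.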


Replies

int_19h (yesterday at 8:11 AM)

> ultimately Opus 4.5 is the same thing as GPT-2; the only difference is that coherence lasts a few pages rather than a few sentences.

This tells me that you haven't really used Opus 4.5 at all.

nl (today at 1:21 AM)

This ignores that reinforcement learning radically changes the training objective.
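
Concretely, instead of grading each next token against a ground-truth continuation, RL-style post-training samples whole completions and reinforces the ones a reward model or verifier scores highly. A rough REINFORCE-style sketch under those assumptions (the helper names are hypothetical):

    import torch

    # Rough REINFORCE-with-baseline sketch of an RL post-training step.
    # There is no "correct next token" here: the policy samples whole
    # completions, a reward model or verifier scores them, and the update
    # raises the log-probability of completions in proportion to reward.
    # `sample_completions` and `reward_fn` are hypothetical helpers.
    def rl_step(policy, prompts, reward_fn, optimizer):
        completions, logprobs = sample_completions(policy, prompts)  # logprobs: summed token log-probs per sample
        rewards = reward_fn(prompts, completions)                    # e.g. preference-model scores, shape (B,)
        advantages = rewards - rewards.mean()                        # simple baseline to reduce variance
        loss = -(advantages.detach() * logprobs).mean()              # policy-gradient objective
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

The gradient here depends on a scalar reward over the whole completion, not on matching any reference text, which is why "autocomplete" stops being an accurate description of the training signal.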

baq (yesterday at 6:00 AM)

First, this is completely ignoring text diffusion and nano banana.

Second, autocompleting the name of the killer in a detective novel that is outside the training set requires following the plot and at least some understanding of it.

libraryofbabel (yesterday at 4:58 PM)

> This is just an appeal to complexity, not a rebuttal to the critique of likening an LLM to a human brain

I wasn’t arguing that LLMs are like a human brain. Of course they aren’t. I said twice in my original post that they aren’t like humans. But “like a human brain” and “autocomplete on steroids” aren’t the only two choices here.

As for appealing to complexity, well, let’s call it more like an appeal to humility in the face of complexity. My basic claim is this:

1) It is a trap to reason from model architecture alone to make claims about what LLMs can and can’t do.

2) The specific version of this in the GP that I was objecting to was: LLMs are just transformers that do next-token prediction, therefore they cannot solve novel problems and just regurgitate their training data. That claim is provably true or false, if we agree on a reasonable definition of "novel problems."

The reason I believe this is that back in 2023 I (like many of us) reasoned from LLM architecture to argue that LLMs had all sorts of limitations around the kind of code they could write, the tasks they could do, and the math problems they could solve. At the end of 2025, SotA LLMs have refuted most of those claims by doing the tasks I thought they'd never be able to do. That was a big surprise to a lot of us in the industry. It still surprises me every day. The facts changed, and I changed my opinion.

So I would ask you: what kind of task do you think LLMs aren’t capable of doing, reasoning from their architecture?

I was also going to mention RL, as I think that is the key differentiator that makes the "knowledge" in the SotA LLMs right now qualitatively different from GPT-2. But other posters already made that point.

This topic arouses strong reactions. I already had one poster (since apparently downvoted into oblivion) accuse me of "magical thinking" and "LLM-induced psychosis"! And I thought I was just making the rather uncontroversial point that things may be more complicated than we all thought in 2023. For what it's worth, I do believe LLMs probably have limitations (they're not going to lead to AGI, and they're never going to do mathematics like Terence Tao), and I also think we're in a huge bubble and a lot of people are going to lose their shirts. But I think we all owe it to ourselves to take LLMs seriously as well. Saying "Opus 4.5 is the same thing as GPT-2" isn't really a pathway to doing that; it's just a convenient way to avoid grappling with the hard questions.

dash2 (yesterday at 4:38 AM)

This would be true if all training were based on sentence completion. But training involving RLHF and RLAIF is increasingly important, isn't it?

A4ET8a8uTh0_v2 (yesterday at 4:16 AM)

But, and I am not asking this for giggles: does it mean humans are giant autocomplete machines?

NiloCK (yesterday at 5:38 AM)

First: a selection mechanism is just a selection mechanism, and it shouldn't cloud the observation of emergent, tangential capabilities.

You probably believe that humans have something called intelligence, but the pressure that produced it - the likelihood that specific genetic material replicates - is far more tangential to intelligence than next-token prediction is.

I doubt many alien civilizations would look at us and say "not intelligent - they're just genetic information replication on steroids".

Second: modern models also undergo a ton of post-training now: RLHF, fine-tuning on specific use cases, and so on. It's just not correct that the token-prediction loss function is "the whole thing."
