Hacker News

Dylan16807, yesterday at 3:17 AM

Unless I'm missing what you mean by a mile, this isn't true at all. We have infinitely precise models for the outcomes of LLMs because they're digital. We are also able to engineer them pretty effectively.


Replies

famouswaffles, yesterday at 3:59 AM

The ML research world (so this isn't simply a matter of being ignorant or uninformed) was surprised by the performance of GPT-2 and utterly shocked by GPT-3. Why? Isn't that strange? Did the transformer architecture fundamentally change between these releases? No, it did not at all.

So why? Because even in 2026, never mind 2018 and 2019, the only way to really know exactly how a neural network trained with x data at y scale will perform is to train it and see. No elaborate "laws", no neat equations. Modern artificial intelligence is an extremely empirical, trial-and-error field, with researchers often giving post-hoc rationalizations for architectural decisions. So no, we do not have any precise models that tell us how an LLM will respond to any query. If we did, we wouldn't need to spend months and millions of dollars training them.
