
majormajor, today at 4:58 AM

> Complicated-enough LLMs also are absolutely doing a lot more than "just trying to predict the next word", as Anthropic's papers investigating the internals of trained models show - there's a lot more decision-making going on than that.

Are there newer changes that are actually doing prediction of tokens out of order or such, or is this a case of immense internal model state tracking that is still used to drive the prediction of the next token, one at a time?

(Wrapped in a variety of tooling/prompts/meta-prompts to further shape what sorts of paragraphs are produced, compared to ye olden days of the GPT-3 chat completion API.)
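
To make the second reading concrete, here is a minimal sketch of "arbitrary internal computation, but output still emitted one token at a time". Everything in it is a made-up stand-in: `TOY_TABLE`, `toy_next_token_scores`, and `greedy_decode` are hypothetical names, and the lookup table is a placeholder for whatever a real model computes internally. It is not a claim about how any particular LLM is implemented.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a trained model: scores for the next token,
# keyed by the most recent token. A real LLM would compute far richer
# internal state from the whole prefix before producing these scores.
TOY_TABLE: Dict[str, Dict[str, float]] = {
    "<s>": {"the": 0.9, "cat": 0.1},
    "the": {"cat": 0.8, "sat": 0.2},
    "cat": {"sat": 0.7, "</s>": 0.3},
    "sat": {"</s>": 1.0},
}


def toy_next_token_scores(prefix: List[str]) -> Dict[str, float]:
    """Return scores over candidate next tokens given the full prefix."""
    # The function receives the whole prefix, so it could do any amount of
    # internal bookkeeping; this toy version only looks at the last token.
    return TOY_TABLE[prefix[-1]]


def greedy_decode(
    score_fn: Callable[[List[str]], Dict[str, float]],
    max_tokens: int = 10,
) -> List[str]:
    """Autoregressive loop: pick one token, append it, repeat."""
    tokens = ["<s>"]
    for _ in range(max_tokens):
        scores = score_fn(tokens)
        # Greedy choice: take the highest-scoring candidate.
        next_token = max(scores, key=lambda tok: scores[tok])
        tokens.append(next_token)
        if next_token == "</s>":
            break
    return tokens


print(greedy_decode(toy_next_token_scores))
# prints ['<s>', 'the', 'cat', 'sat', '</s>']
```

However sophisticated `score_fn` gets, the outer loop in this sketch never changes: the output grows strictly left to right, one token per step.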