csomar today at 8:28 AM

Unless LLM architecture has changed, that is exactly what they are doing. You might need to learn more about how LLMs work.


Replies

andy12_ today at 9:47 AM

Unless the LLM is a base model or just a fine-tuned base model, it definitely doesn't predict words based only on how likely they are in similar sentences it was trained on. Reinforcement learning is a thing, and nearly all models nowadays are extensively trained with it.

If anything, they predict words based on a heuristic ensemble of two things: which word is most likely to come next in similar sentences, and which word is most likely to lead to a higher final reward.
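To make the "ensemble" idea concrete, here is a toy sketch, not how any real model is implemented. It assumes the textbook result for KL-regularized reward maximization, where the tuned distribution is proportional to the base distribution times exp(reward / beta). The three-word vocabulary, the reward numbers, and the names `softmax`, `rl_tilted_distribution`, and `beta` are all made up for illustration. In practice RL changes the weights during training rather than combining two scores at inference time.

```python
import math

def softmax(scores):
    """Normalize a dict of log-scores into probabilities."""
    m = max(scores.values())
    exps = {tok: math.exp(s - m) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def rl_tilted_distribution(base_logprobs, expected_reward, beta=1.0):
    """Toy model: p(token) is proportional to p_base(token) * exp(reward / beta).

    beta (hypothetical knob) controls how strongly reward overrides
    plain likelihood; a large beta stays close to the base model.
    """
    scores = {
        tok: base_logprobs[tok] + expected_reward[tok] / beta
        for tok in base_logprobs
    }
    return softmax(scores)

# Hypothetical next-word likelihoods from a base model.
base = {"yes": math.log(0.6), "no": math.log(0.3), "maybe": math.log(0.1)}
# Hypothetical estimate of final reward if each word is chosen.
reward = {"yes": 0.0, "no": 0.0, "maybe": 2.0}

base_probs = softmax(base)
tuned_probs = rl_tilted_distribution(base, reward, beta=1.0)

print(max(base_probs, key=base_probs.get))
print(max(tuned_probs, key=tuned_probs.get))
```

With these made-up numbers the base model prefers "yes", while the reward-tilted distribution prefers "maybe", even though "maybe" was the least likely continuation in the "training data" sense.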
