
charcircuit • last Saturday at 9:15 AM

>because it would try to play like the average human plays in it's database.

Why would it play like the average? LLMs pick tokens to try to maximize a reward function; they don't just pick the most common word from the training data set.
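
A toy sketch of that difference, not a description of how any particular model is trained. The option names, frequencies, rewards, and `beta` are all made up for illustration. It assumes the standard KL-regularized setup, where the reward-maximizing policy reweights the base distribution by `exp(reward / beta)`:

```python
import math

# Hypothetical frequencies of three options in the training data (made-up numbers).
base_probs = {"option_a": 0.50, "option_b": 0.30, "option_c": 0.20}
# Hypothetical reward for each option (also made up).
rewards = {"option_a": 0.2, "option_b": 0.3, "option_c": 0.9}


def reward_tilted(base, reward, beta):
    """Policy that maximizes expected reward minus beta * KL(policy || base)."""
    weights = {k: p * math.exp(reward[k] / beta) for k, p in base.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# What "pick the most common thing in the data" would choose.
most_common = max(base_probs, key=base_probs.get)

# What a reward-tuned policy prefers under these toy numbers.
tuned = reward_tilted(base_probs, rewards, beta=0.2)
preferred = max(tuned, key=tuned.get)

print("most common in data:", most_common)
print("preferred after reward tuning:", preferred)
```

With these numbers the most frequent option and the reward-preferred option are different, which is the point: frequency in the data is only the starting distribution. A smaller `beta` lets the reward dominate, and a larger one keeps the policy closer to the data.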
