Hacker News

NBJack, last Monday at 6:44 PM

In other words, they learn the game, not how to play games.


Replies

fsmv, last Monday at 6:51 PM

They memorize the answers, not the process to arrive at answers.

IshKebab, last Monday at 6:59 PM

Well yeah... If you only ever played one game in your life you would probably be pretty shit at other games too. This does not seem very revealing to me.

beefnugs, last Monday at 7:47 PM

Yeahhhh, why isn't there a training structure where you play 5000 games and the reward function is based on doing well in all of them?
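To make the "reward based on doing well in all of them" idea concrete, here is a rough sketch, not a description of how any real system is trained. The game names, score ranges, and the `worst_case_weight` parameter are all made up for illustration. The point is only that blending the average with the worst per-game score stops a one-game specialist from scoring well.

```python
from statistics import mean

# Hypothetical per-game score ranges (min, max), used only for normalization.
RANGES = {"pong": (-21, 21), "breakout": (0, 400), "maze": (0, 100)}

def normalized_score(raw, lo, hi):
    """Map a raw game score to [0, 1] so games with different scales are comparable."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (raw - lo) / (hi - lo)))

def multi_game_reward(scores, ranges=RANGES, worst_case_weight=0.5):
    """Blend the mean and the minimum normalized score across all games.

    The minimum term penalizes an agent that is great at one game
    and terrible at the rest.
    """
    normed = [normalized_score(scores[g], *ranges[g]) for g in scores]
    return (1 - worst_case_weight) * mean(normed) + worst_case_weight * min(normed)

# A specialist maxes out one game; a generalist is mediocre at all three.
specialist = {"pong": 21, "breakout": 0, "maze": 0}
generalist = {"pong": 0, "breakout": 200, "maze": 50}

print(multi_game_reward(specialist))
print(multi_game_reward(generalist))
```

Under these assumed numbers the generalist gets the higher reward, which is the behavior the question is asking for.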

I guess it's a totally different level of control: instead of immediately choosing a button to press, you need to set longer-term goals, like "press whatever sequence I need over this time window to end up closer to this result."

There is some kind of nested, multi-level objective to train on here, rather than a series of immediate, limited choices.
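A toy sketch of that nesting, assuming a 1-D world where buttons move the agent left or right. Everything here (the `BUTTONS` table, the function names, the `horizon` parameter) is hypothetical, and both levels are hand-coded rules standing in for what would be learned policies in a real setup. The upper level picks a subgoal every `horizon` steps; the lower level only chooses buttons that get closer to the current subgoal.

```python
# Hypothetical button set: each button shifts the agent's position.
BUTTONS = {"left": -1, "right": +1, "noop": 0}

def low_level_action(position, subgoal):
    """Pick the button whose result lands closest to the current subgoal."""
    return min(BUTTONS, key=lambda b: abs(position + BUTTONS[b] - subgoal))

def high_level_subgoal(position, target, horizon):
    """Pick a subgoal at most `horizon` steps away, in the direction of the target."""
    step = max(-horizon, min(horizon, target - position))
    return position + step

def run_episode(start, target, horizon=3, max_steps=50):
    """Alternate between the two levels until the target is reached."""
    position, trace = start, []
    for t in range(max_steps):
        if position == target:
            break
        if t % horizon == 0:  # the upper level only acts every `horizon` steps
            subgoal = high_level_subgoal(position, target, horizon)
        button = low_level_action(position, subgoal)
        position += BUTTONS[button]
        trace.append((subgoal, button))
    return position, trace

final_position, trace = run_episode(start=0, target=7)
print(final_position, trace)
```

The lower level never sees the final target, only the subgoal it was handed, which is the "press whatever sequence gets me closer to this result" split described above.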