Hacker News

kadoban, last Saturday at 9:45 PM

> That has been known since at least the 1990s with TD-Gammon beating the world champions in Backgammon.

Yeah, I didn't mean to imply that reinforcement learning (or applying it in this way) is novel. It was just important to work out how to apply that to Go specifically.

> In a sense, classic chess engines do that, too: alpha-beta-search uses a very weak model (eg just checking for checkmate, otherwise counting material, or what have you) and search to generate a much stronger player. You can use that to generate data for training a better model.

I would say that classic chess AIs specifically don't do the important part. They can't take a worse model and, with computation, train a better model. They can generate training data, but then they have no way to incorporate it back into the AI.
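
To make the "incorporate it back" step concrete, here's a rough toy sketch, not how any real Go or chess engine works. Everything in it is my own assumption for illustration: the game (a pile of stones, take 1 to 3 per turn, whoever takes the last stone wins), the "model" (just a lookup table of value estimates), and the names `search` and `train`. The idea is that a shallow search on top of a weak model gives better value estimates than the model alone, and those estimates become the next version of the model.

```python
def search(n, depth, model):
    """Negamax value of a pile of n stones, from the view of the player to move."""
    if n == 0:
        return -1.0  # the previous player took the last stone, so the mover has lost
    if depth == 0:
        return model[n]  # out of search budget: trust the (possibly weak) model
    # Try every legal move; the opponent's value is the negative of ours.
    return max(-search(n - take, depth - 1, model)
               for take in (1, 2, 3) if take <= n)


def train(max_pile=20, depth=2, rounds=12):
    """Repeatedly replace the model with the results of searching on top of it."""
    model = [0.0] * (max_pile + 1)  # hypothetical starting model: knows nothing
    for _ in range(rounds):
        # Search + current model produces improved targets for every position...
        targets = [search(n, depth, model) for n in range(max_pile + 1)]
        # ...and this assignment is the feedback step a classic engine lacks.
        model = targets
    return model


if __name__ == "__main__":
    learned = train()
    for n, value in enumerate(learned):
        print(n, value)
```

In this toy the table ends up matching the known solution of the game (a pile divisible by 4 is a loss for the player to move). A real system would swap the table for a neural network and the assignment for gradient updates, but the loop has the same shape.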