Hacker News

kadoban last Saturday at 7:39 AM

> But the resolution there was MCTS

MCTS wasn't _really_ the solution to Go. MCTS-based AIs existed for years, and they weren't _that_ good: they certainly weren't superhuman, and the moves and games they played were kind of boring.

The key to playing Go well was something that vaguely looks like MCTS, but the real guts are networks that can answer "who's winning?" (a value network) and "what are good moves to try here?" (a policy network), with those answers guiding the search. Additionally essential was the realization that computation (running search for a while) with a weak model could be used, effectively and efficiently, to generate better training data for training a better model.
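A minimal sketch of that guidance loop, in the style of AlphaGo's PUCT rule: the policy network's prior steers exploration toward promising moves, and the value network's estimate is backed up instead of a random rollout. The `policy_net` and `value_net` stubs below are invented placeholders, not real models, and the search is deliberately one level deep.

```python
import math

def policy_net(state):
    # Stand-in network: a fixed prior over three hypothetical moves.
    return {"a": 0.5, "b": 0.3, "c": 0.2}

def value_net(state):
    # Stand-in network: a fixed "who's winning?" estimate in [-1, 1].
    return 0.1

class Node:
    def __init__(self, prior):
        self.prior = prior      # P(s, a) from the policy network
        self.visits = 0         # N(s, a)
        self.value_sum = 0.0    # W(s, a)

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(children, c_puct=1.5):
    # PUCT: exploit high Q, but also explore moves the policy likes
    # that the search hasn't visited much yet.
    total = sum(ch.visits for ch in children.values())
    def score(item):
        move, ch = item
        u = c_puct * ch.prior * math.sqrt(total + 1) / (1 + ch.visits)
        return ch.q() + u
    return max(children.items(), key=score)

def simulate(root_state, n_sims=50):
    children = {m: Node(p) for m, p in policy_net(root_state).items()}
    for _ in range(n_sims):
        move, child = select_child(children)
        # Back up the value network's estimate rather than a rollout.
        child.visits += 1
        child.value_sum += value_net((root_state, move))
    # The most-visited move is the search's answer.
    return max(children.items(), key=lambda kv: kv[1].visits)[0]
```

With equal value estimates everywhere, visits end up spread roughly in proportion to the policy prior, which is the point: the network shapes where the search spends its compute.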


Replies

eru last Saturday at 9:13 AM

> Additionally essential was realizing that computation (run search for a while) with a bad model could be effectively+efficiently used to generate better training data to train a better model.

That has been known since at least the 1990s, with TD-Gammon beating world champions at backgammon. See e.g. http://incompleteideas.net/book/ebook/node108.html or https://en.wikipedia.org/wiki/TD-Gammon
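The core TD-Gammon idea, stripped to a toy: a value function learned purely from self-generated play, via the TD(0) update. Below it's applied to the classic random-walk chain (states 1-5, terminal reward 1 on the right), which is an illustration I've chosen, not TD-Gammon itself.

```python
import random

random.seed(0)

def td0_random_walk(episodes=5000, alpha=0.1):
    # States 0..6; 0 and 6 are terminal, reaching 6 pays reward 1.
    # True values of states 1..5 are 1/6, 2/6, ..., 5/6.
    V = [0.0] * 7
    for _ in range(episodes):
        s = 3                       # every episode starts in the middle
        while s not in (0, 6):
            s2 = s + random.choice((-1, 1))
            r = 1.0 if s2 == 6 else 0.0
            # TD(0): move V[s] toward the bootstrapped one-step target.
            target = r if s2 in (0, 6) else V[s2]
            V[s] += alpha * (target - V[s])
            s = s2
    return V

V = td0_random_walk()
```

No position is ever hand-labeled; the estimates come entirely from experienced transitions, which is the same property that made TD-Gammon's self-play training work.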

In a sense, classic chess engines do that, too: alpha-beta search combines a very weak model (e.g. just checking for checkmate, otherwise counting material, or what have you) with deep search to produce a much stronger player. You can then use that to generate data for training a better model.
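That weak-model-plus-search combination can be sketched in a few lines. Here the game tree is an explicit nested list and the leaf numbers stand in for a crude static evaluation such as a material count; both are invented for illustration.

```python
def alphabeta(node, alpha, beta, maximizing):
    if isinstance(node, (int, float)):
        return node                 # leaf: the weak static evaluation
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, best)
            if alpha >= beta:       # cutoff: opponent won't allow this line
                break
        return best
    else:
        best = float("inf")
        for child in node:
            best = min(best, alphabeta(child, alpha, beta, True))
            beta = min(beta, best)
            if alpha >= beta:
                break
        return best

# A tiny hand-built tree; its minimax value works out to 3.
tree = [[3, 5], [6, [1, 2]], [0, -1]]
```

Even though the leaf evaluation is trivial, the search backs up values through the tree, so the move it recommends is far better than what the evaluation alone would pick.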
