> But the resolution there was MCTS
MCTS wasn't _really_ the solution to Go. MCTS-based AIs had existed for years and they weren't _that_ good. They certainly weren't superhuman, and the moves/games they played were kind of boring.
The key to playing Go well was doing something that vaguely looks like MCTS, but where the real guts are a network that can answer "who's winning?" and "what are good moves to try here?", with those answers guiding the search. Additionally essential was realizing that computation (running search for a while) with a bad model could be used effectively and efficiently to generate better training data for a better model.
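Concretely, the "use the network to guide search" step ends up looking like the PUCT selection rule from the AlphaGo/AlphaZero papers. Here's a minimal sketch; the `Node` fields, the `c_puct` constant, and the assumption that `prior` and `value_sum` come from a policy/value network evaluated elsewhere are all illustrative, not any particular implementation:

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    prior: float                  # policy net's probability for the move into this node
    visits: int = 0
    value_sum: float = 0.0        # running sum of value-net estimates backed up here
    children: dict = field(default_factory=dict)  # move -> Node

def select_child(node, c_puct=1.5):
    """PUCT: exploit the value estimate (Q, 'who's winning?') while
    exploring moves the policy net likes but that are under-visited
    (U, 'what's worth trying here?')."""
    sqrt_total = math.sqrt(sum(c.visits for c in node.children.values()) + 1)
    def score(child):
        q = child.value_sum / child.visits if child.visits else 0.0
        u = c_puct * child.prior * sqrt_total / (1 + child.visits)
        return q + u
    return max(node.children.values(), key=score)
```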
> Additionally essential was realizing that computation (running search for a while) with a bad model could be used effectively and efficiently to generate better training data for a better model.
That has been known since at least the 1990s, when TD-Gammon reached world-champion-level play in backgammon. See e.g. http://incompleteideas.net/book/ebook/node108.html or https://en.wikipedia.org/wiki/TD-Gammon
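For anyone who hasn't seen it, the core of TD-Gammon is the temporal-difference update: nudge the value of a state toward the bootstrapped value of its successor, with self-play supplying the stream of transitions. TD-Gammon actually used TD(λ) with a neural net as the value function; this tabular TD(0) sketch just shows the bootstrapping idea in miniature (the names and defaults are for illustration only):

```python
def td_update(values, state, next_state, reward, alpha=0.1, gamma=1.0):
    """TD(0): move V(state) toward the target r + gamma * V(next_state).
    `values` is a dict acting as the value table; a neural net playing
    against itself fills the same role in TD-Gammon."""
    v_next = values.get(next_state, 0.0)
    v_now = values.get(state, 0.0)
    values[state] = v_now + alpha * (reward + gamma * v_next - v_now)
```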
In a sense, classic chess engines do that too: alpha-beta search takes a very weak model (e.g. just checking for checkmate and otherwise counting material, or what have you) and uses search to produce a much stronger player. You can then use that stronger player to generate data for training a better model.
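A minimal sketch of that amplification, as negamax alpha-beta; the game interface (`moves`, `apply_move`, and a weak `evaluate` scored from the side to move's perspective) is hypothetical, not any real engine's API. The backed-up scores it returns are exactly the kind of thing you could log as training targets for a better evaluation:

```python
def alphabeta(state, depth, alpha, beta, moves, apply_move, evaluate):
    """Negamax alpha-beta. `evaluate` is the weak model (e.g. material
    count from the side to move's view); depth-limited search amplifies
    it into a much stronger player."""
    legal = moves(state)
    if depth == 0 or not legal:
        return evaluate(state)
    best = -float("inf")
    for m in legal:
        # Negate: a good score for the opponent is a bad score for us.
        score = -alphabeta(apply_move(state, m), depth - 1, -beta, -alpha,
                           moves, apply_move, evaluate)
        best = max(best, score)
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # prune: the opponent would never allow this line
    return best
```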