This is interesting. If you approach this game as individual moves, the search tree is really deep. However, most levels can be expressed as a few intermediate goals.
In some ways, this reminds me of the history of AI for Go (the board game). But the resolution there was MCTS, which wasn't at all what we wanted (insofar as MCTS doesn't generalize to most things).
> But the resolution there was MCTS
MCTS wasn't _really_ the solution to Go. MCTS-based AIs existed for years and they weren't _that_ good: certainly not superhuman, and the moves/games they played were kind of boring.
The key to playing Go well was something that vaguely looks like MCTS, but whose real guts are networks that can answer "who's winning?" (a value network) and "what are good moves to try here?" (a policy network), using those answers to guide the search. Just as essential was the realization that spending computation (running search for a while) with a mediocre model could be used, effectively and efficiently, to generate better training data for a better model.
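To make that concrete, here's a minimal sketch of that kind of guided search in Python. The `state` interface (`copy()`, `play(move)`) and the `net(state)` callable returning a `(policy, value)` pair are hypothetical stand-ins, not any real library's API; the point is that the policy priors steer which branches get explored, and the value estimate replaces classic MCTS's random rollouts:

```python
import math

# AlphaZero-style guided search, sketched with assumed interfaces:
#   state.copy(), state.play(move)
#   net(state) -> (policy, value), where `policy` maps legal moves to prior
#   probabilities and `value` is the current player's win estimate in [-1, 1].

class Node:
    def __init__(self, prior):
        self.prior = prior       # P(s, a): the policy net's prior for this move
        self.visits = 0          # N(s, a)
        self.value_sum = 0.0     # W(s, a)
        self.children = {}       # move -> Node

    def q(self):                 # Q(s, a): mean value over visits
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    # PUCT rule: exploit high Q, but explore moves the policy net likes
    # (high prior) that haven't been visited much yet.
    sqrt_n = math.sqrt(node.visits)
    return max(
        node.children.items(),
        key=lambda mc: mc[1].q() + c_puct * mc[1].prior * sqrt_n / (1 + mc[1].visits),
    )

def search(root_state, net, num_simulations=800):
    root = Node(prior=1.0)
    for _ in range(num_simulations):
        state, node, path = root_state.copy(), root, [root]
        # Selection: descend via PUCT until reaching an unexpanded leaf.
        while node.children:
            move, node = select_child(node)
            state.play(move)
            path.append(node)
        # Expansion + evaluation: one net call replaces a random rollout.
        policy, value = net(state)
        for move, prior in policy.items():
            node.children[move] = Node(prior)
        # Backup: credit the value up the path, flipping sign per ply.
        for n in reversed(path):
            n.visits += 1
            n.value_sum += value
            value = -value
    # The most-visited root move is the search's improved policy.
    return max(root.children.items(), key=lambda mc: mc[1].visits)[0]
```

The second half of the comment is the training loop around this: record each searched position together with the root visit counts and the eventual game outcome, train the net toward those targets, and repeat with the stronger net guiding the next round of search.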
> However, most levels can be expressed as a few intermediate goals
I think generally the whole thing with puzzle games is that you have to determine the “right” intermediate goals. In fact, the naive intermediate goals are often entirely wrong!
A canonical Sokoban-like inversion might be a level where you have to push two blocks into goal areas. You might think "OK, push one block into its goal area, then push the other one into its."
But many of these games have mechanisms that mean you'd actually want to push one block into its goal, then undo that for some reason (landing it there might activate some extra functionality), push the other block, and only then go back and redo the first push.
There are always weird tricks that mean you're going to walk backwards before walking forwards. I don't think it's impossible for these things to stumble into that, though; they just might spin a lot of cycles getting there (humans do too, I guess)
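To see why naively freezing subgoals breaks, here's a toy Python model of the inversion described above. The mechanics (a pressure plate that latches a gate open, a corridor that block A obstructs while sitting on its goal) are invented for illustration:

```python
from collections import deque

# Toy abstraction of the two-block inversion. State is (a_on, b_on, gate_open).
# Pushing A onto its goal presses a plate that latches a gate open permanently,
# but A then blocks the corridor B needs, so A has to come back off before B
# can cross. All mechanics here are made up for illustration.

START = (False, False, False)

def is_goal(state):
    return state[0] and state[1]   # both blocks on their goals

def moves(state):
    a_on, b_on, gate = state
    if not a_on:
        yield "push A on", (True, b_on, True)    # plate latches the gate open
    else:
        yield "push A off", (False, b_on, gate)  # gate stays latched
    if gate and not a_on and not b_on:
        yield "push B on", (a_on, True, gate)    # corridor must be clear of A

def plan(freeze_subgoals):
    """BFS over abstract states. freeze_subgoals=True mimics a naive
    planner that refuses to undo an already-achieved subgoal."""
    queue, seen = deque([(START, [])]), {START}
    while queue:
        state, path = queue.popleft()
        if is_goal(state):
            return path
        for name, nxt in moves(state):
            undoes = (state[0] and not nxt[0]) or (state[1] and not nxt[1])
            if freeze_subgoals and undoes:
                continue                         # never walk backwards
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [name]))
    return None

print(plan(freeze_subgoals=True))   # None: greedy serialization dead-ends
print(plan(freeze_subgoals=False))  # ['push A on', 'push A off', 'push B on', 'push A on']
```

The frozen-subgoal search returns no plan at all, while the unrestricted one finds the four-move push-on, push-off, push-the-other, push-back line: exactly the walk-backwards-before-walking-forwards pattern.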