energy123 • today at 5:02 AM • 3 replies

Models based on RL are still just remixers as defined above, but their distribution can cover things that are unknown to humans because they are present in the synthetic training data but not in the corpus of human awareness. AlphaGo's move 37 is an example. It appears creative and new to outside observers, and it is creative and new. But that's not because the model is figuring out something new on the spot; it's because similar new things appeared in the synthetic training data used to train the model, and the model is summoning those patterns at inference time.

Replies

trick-or-treat • today at 5:34 AM

> the model is summoning those patterns at inference time.

You can make that claim about anything: "The human isn't being creative when they write a novel, they're just summoning patterns at typing time".

AlphaGo taught itself that move, then recalled it later. That's the bar for human creativity and you're holding AlphaGo to a higher standard without realizing it.

smokel • today at 7:50 AM

No. AlphaGo does search, and does so imperfectly. It does come up with creative new patterns not seen before.

pu_pe • today at 9:29 AM

How do you know that? We don't have access to the logs to know anything about its training, and it's impossible for it to have trained on every potential position in Go.
