
4b11b4, yesterday at 5:22 PM

This isn't quite RL, right...? It's an evolutionary approach on specifically labeled sections of code optimizing towards a set of metrics defined by evaluation functions written by a human.

I suppose you could consider that last part (optimizing some metric) "RL".
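To make the "evolutionary approach plus human-written evaluation function" description concrete, here is a minimal sketch of such a loop. It is a toy, not the system under discussion: the candidate is a short list of integers standing in for a labeled section of code, and `evaluate`, `mutate`, and `evolve` are hypothetical names made up for this illustration.

```python
import random

def evaluate(candidate):
    """Hypothetical human-written evaluation function; higher is better."""
    target = [3, 1, 4, 1, 5]  # assumed optimum, for illustration only
    return -sum((c - t) ** 2 for c, t in zip(candidate, target))

def mutate(candidate, rng):
    """Return a copy with one randomly chosen entry nudged by +1 or -1."""
    child = list(candidate)
    i = rng.randrange(len(child))
    child[i] += rng.choice([-1, 1])
    return child

def evolve(generations=200, population_size=20, keep=5, seed=0):
    rng = random.Random(seed)
    population = [[0] * 5 for _ in range(population_size)]
    for _ in range(generations):
        # Select: keep the top scorers under the evaluation function.
        survivors = sorted(population, key=evaluate, reverse=True)[:keep]
        # Vary: refill the population with mutated copies of survivors.
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population_size - keep)]
        population = survivors + children
    return max(population, key=evaluate)

best = evolve()
```

Nothing here learns a policy or a value estimate from rewards; the metric only ranks candidates for selection, which is the distinction the comment is drawing.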

However, it's missing a key concept of RL: the exploration/exploitation tradeoff.
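For reference, the tradeoff in its simplest textbook form is an epsilon-greedy multi-armed bandit. The sketch below is an illustration under assumed settings (Gaussian rewards with unit variance, a fixed `epsilon`, made-up arm means), and `epsilon_greedy` is a hypothetical helper name, not something from the work being discussed.

```python
import random

def epsilon_greedy(true_means, epsilon=0.1, steps=5000, seed=0):
    """Run an epsilon-greedy agent on a Gaussian bandit (assumed setup)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a uniformly random arm.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pull the arm with the best current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the running mean for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates

counts, estimates = epsilon_greedy([0.2, 0.5, 0.8])
```

The `epsilon` parameter is the explicit knob: with probability `epsilon` the agent gathers information, otherwise it acts on what it already believes.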