zahlman • today at 12:26 AM

AI can generally improve through reinforcement learning, but that requires it to be able to compare its output against some kind of metric. There aren't a lot of people I'd trust to provide RLHF feedback on code quality, and anything more automated than that is destined to collapse under Goodhart's Law: once the metric becomes the optimization target, it stops being a good measure of quality.
