chaos_emergent, today at 6:49 PM

Human-driven research is also brute-force search, just with a more efficient search strategy. One can think of a parameter that represents how efficiently the research search space is navigated. RL-trained agents will inevitably optimize for that parameter. I agree with your statement insofar as the value of that efficiency parameter is lower for agents than for humans today.

It's really hard to imagine that they __won't__ exceed the human value of that efficiency parameter rather soon, given that 1. there are plenty of scalar value functions that can represent research efficiency, a subset of which will result in robust training, and 2. AI labs have a massive incentive to increase their overall research efficiency, along with billions of dollars and really good human researchers working on the problem.