logoalt Hacker News

IanCalyesterday at 11:34 PM0 repliesview on HN

A few points:

1. I think you have mixed up assistance and expertise. They talk about not needing a human in the loop for verification and to continue search but not about initial starts. Those are quite different. One well specified task can be attempted many times, and the skill sets are overlapping but not identical.

2. The article is about where they may get to rather than just what they are capable of now.

3. There’s no conflict between the idea that 10 parallel agents of the top models can mostly have one that successfully exploits a vulnerability - gated on an actual test that the exploit works - with feedback and iteration BUT random models pointed at arbitrary code without a good spec and without the ability to run code, and just run once, will generate lower quality results.