Hacker News

jascha_eng, today at 1:28 AM

This mostly reads as a comparison between Opus 4.7 and 4.1. It would be more interesting if they reran the experiment against a team of humans with 4.7 to see how much the humans still improve the results today.