Hacker News

Imanari, 01/20/2025

Performance seems to hold up on the aider benchmark. R1 comes in second place with 56.9%, behind O1's 61.7%.

https://aider.chat/docs/leaderboards/