Hacker News

verdverm, yesterday at 6:01 PM

Here's a good thread spanning more than a month, updated as each model comes out:

https://bsky.app/profile/pekka.bsky.social/post/3meokmizvt22...

tl;dr: Pekka says ARC-AGI-2 is now toast as a benchmark.


Replies

Aperocky, yesterday at 6:24 PM

If you look at the problem space, it is easy to see why it's toast. Maybe there's intelligence in there, but it's hardly general.
