Hacker News

neilellis yesterday at 5:43 PM

Less than a year to destroy Arc-AGI-2 - wow.


Replies

Davidzheng yesterday at 5:52 PM

I unironically believe that ARC-AGI-3 will have an introduction-to-solved time of 1 month.

modeless yesterday at 7:29 PM

It's still useful as a benchmark of cost/efficiency.

XCSme yesterday at 6:51 PM

But why only a +0.5% increase for MMMU-Pro?

saberience yesterday at 6:48 PM

It's a useless, meaningless benchmark though. It just got a catchy name, as in, if the models solve this it means they have "AGI", which is clearly rubbish.

Arc-AGI score isn't correlated with anything useful.
