Hacker News

fchollet today at 4:25 AM | 3 replies

It is 100% ARC-AGI-3 specific though, just read through the prompts https://github.com/symbolica-ai/ARC-AGI-3-Agents/blob/symbol...


Replies

boxed today at 6:01 AM

What a dick move. Making that prompt open source will probably mean that every other model developer who doesn't want to cheat will scrape it and accidentally cheat with their next models.

diwank today at 6:23 AM

This is so disingenuous on Symbolica's part. These insincere announcements just make it harder for genuine attempts and novel ideas to get noticed.

DetroitThrow today at 5:11 AM

Um, yes, this is extremely specific as a benchmark harness. It has a ton of knowledge encoded about the tasks at hand. The tweet is dishonest even in the best light.

The hard part of these tests isn't purely reasoning ability ffs.