Hacker News

vessenes today at 4:38 AM

I’m not a Chollet booster. Well, I might be a little bit of one in that I admire his persistence.

I really like these puzzles. There’s a lot to them both in design and scoring; models trained to do well on these are going to be genuinely much more useful, so I’m excited about it. Unlike -1 and -2, doing well at these requires:

- Visual reasoning

- Path planning (and some fairly long paths)

- Mouse/screen interaction

- Color and shape analysis

- Cross-context learning/remembering

Probably more; I only did like five or six of these. We really want models that are good at all this; it covers a lot of what current agentic loops are super weak at. So I hope M. Chollet is successful at getting frontier labs to put a billion or so into training for these.