minimaxir • yesterday at 6:35 PM • 2 replies

Note that GPT 5.2 newly supports an "xhigh" reasoning level, which could explain the better benchmarks.

It'll be noteworthy to see the cost-per-task on ARC AGI v2.

Replies

> It'll be noteworthy to see the cost-per-task on ARC AGI v2.

Already live. gpt-5.2-pro scores a new high of 54.2% with a cost/task of $15.72. The previous best was Gemini 3 Pro (54% with a cost/task of $30.57).

The best bang-for-your-buck is the new xhigh on gpt-5.2, which is 52.9% for $1.90, a big improvement on the previous best in this category which was Opus 4.5 (37.6% for $2.40).

https://arcprize.org/leaderboard
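
To make the bang-for-your-buck comparison concrete, here is a minimal sketch that divides each quoted score by its quoted cost per task. The numbers are the ones cited in this comment; "points per dollar" is a hypothetical metric chosen for illustration, not something the ARC Prize leaderboard reports.

```python
# Figures as quoted in the comment: (score in percent, cost per task in USD).
results = {
    "gpt-5.2-pro": (54.2, 15.72),
    "Gemini 3 Pro": (54.0, 30.57),
    "gpt-5.2 xhigh": (52.9, 1.90),
    "Opus 4.5": (37.6, 2.40),
}


def points_per_dollar(score, cost):
    """Hypothetical efficiency metric: score percentage points per USD per task."""
    return score / cost


# Sort entries from most to least efficient under this assumed metric.
ranked = sorted(
    results.items(),
    key=lambda kv: points_per_dollar(*kv[1]),
    reverse=True,
)

for name, (score, cost) in ranked:
    print(f"{name}: {points_per_dollar(score, cost):.1f} points per dollar")
```

A simple ratio like this ignores that the last few percentage points are usually the most expensive, so treat it as a rough ordering rather than a fair ranking.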

walletdrainer • yesterday at 7:44 PM

5.1-codex supports that too, no? Pretty sure I’ve been using xhigh for at least a week now
