nsingh2, today at 6:13 PM

That seems to be true only for the "Agentic Search" benchmark. That benchmark in particular is a bit weird: effort level had a relatively small effect on Sonnet 4.6's scores, so Sonnet 5 med is basically comparable to Sonnet 4.6 at every effort level.