Hacker News

alvis today at 6:06 PM (2 replies)

What I'm starting to hate is that the same effort level can mean a completely different amount of power on each model.

Today, Sonnet 5's medium effort level is equivalent to Sonnet 4.6's low effort level :/


Replies

nsingh2 today at 6:13 PM

That seems to only be true for the "Agentic Search" benchmark. That benchmark in particular is a bit weird, because Sonnet 4.6 effort levels had a relatively small effect, so Sonnet 5 med is basically comparable to all effort levels of Sonnet 4.6.