i_have_an_idea yesterday at 8:35 PM

Is it a frontier player though, or perhaps a new benchmaxxed model? People were saying similar things about Grok but it ultimately amounted to little.


Replies

wasabi991011 yesterday at 8:55 PM

"preferred by humans over Sonnet 4.6" makes it pretty clearly not benchmaxxed though.

At least when you define benchmaxxed as "good in benchmarks but not human preference".