Hacker News

steve_avery, today at 4:09 AM

I'd be interested, but they don't list a single Anthropic model on their code review benchmark, so I feel like they haven't really tested it against SOTA models.


Replies

nomel, today at 4:26 AM

Whenever I see this, I make the (almost always correct) assumption that the SOTA models had an advantage. The alternative explanation is a complete lack of awareness of the state of AI, which is very rare for a tool like this.

With SOTA missing, it's also a strong indicator that someone like you is not the target audience.