FrontierCode is likely paid for by Anthropic.
Did they not pay them enough to get good ratings on the other 3 models?
What's the logic in claiming it's a borked metric when everything listed is an Anthropic model?
Huh? It's a benchmark by Cognition which (1) is building their own models and (2) offers all providers and thus has an incentive to avoid hyping up any one too much.