Hacker News

grandinquistor, yesterday at 4:05 PM

Looking at the system card for Opus 4.7, the score on the MCRC benchmark, which is used for long-context tasks, dropped significantly, from 78% to 32%.

I wonder what caused such a large regression on this benchmark.