Looking at the system card for Opus 4.7, I see that the MCRC benchmark used for long-context tasks dropped significantly, from 78% to 32%.
I wonder what caused such a large regression (46 percentage points) on this benchmark.