Hacker News

qwesr123, yesterday at 1:59 PM

FYI, the MarginLab Claude Code degradation tracker is showing a statistically significant ~4% drop in SWE-Bench-Pro accuracy over the past month.