This seems like an important caveat to the SWE-bench results, but the trend is still clearly toward AI becoming more and more capable.