We haven't hit the RSI threshold yet, so evolution is slow enough that a run usually ends one of two ways: it gets terminated as not useful, or it solves a concrete problem and is then stopped, either by itself or by a human. Earlier model+framework combinations simply petered out almost immediately. I'm guessing that how far these runs get is roughly correlated with progress on METR.