It doesn't have to be _much more intelligent_ than Opus to be a risk. It doesn't even need to be _more intelligent_. It just needs to be _better at finding security problems_, which could come from minor improvements in the training data, the harness, etc. Even a small improvement could shift it from finding very few new security holes to reliably finding many at scale.
Yeah, I think a lot of the disconnect here is that people think of "model intelligence" as a single IQ-style score, rather than a combination of scores that measure ability across a wide variety of domains.