Why does this read like an OpenAI ad?
These plots are terrible. Why is categorical data connected across categories with lines? Why not just use bar plots?
For example, in the "Web Vulns in OSS" plot, white-box data for Opus 4.7 is not available, but the absurd linear interpolation across categories implies it should be near 60.
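Roughly what I mean, as a sketch only: grouped bars, with a missing measurement left as a gap instead of a line drawn through it. The numbers, the "Model A" name, and the "black box" category are placeholders I made up for illustration, not values from the article; `bar_layout` and `plot` are hypothetical helpers.

```python
# Placeholder scores, NOT taken from the article. None marks a missing measurement.
scores = {
    "Model A": {"black box": 40.0, "white box": 55.0},
    "Opus 4.7": {"black box": 50.0, "white box": None},
}
categories = ["black box", "white box"]


def bar_layout(scores, categories, width=0.35):
    """Return one (x, height, model) tuple per bar, skipping missing values."""
    bars = []
    for i, model in enumerate(scores):
        for j, cat in enumerate(categories):
            value = scores[model].get(cat)
            if value is None:
                continue  # no bar at all, so no value is implied
            bars.append((j + i * width, value, model))
    return bars


def plot(scores, categories, width=0.35):
    """Draw the grouped bar chart (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily so bar_layout works without it

    fig, ax = plt.subplots()
    bars = bar_layout(scores, categories, width)
    for model in scores:
        xs = [x for x, _, m in bars if m == model]
        hs = [h for _, h, m in bars if m == model]
        ax.bar(xs, hs, width=width, label=model)
    # Center each tick under its group of bars.
    offset = width * (len(scores) - 1) / 2
    ax.set_xticks([j + offset for j in range(len(categories))])
    ax.set_xticklabels(categories)
    ax.set_ylabel("score")
    ax.legend()
    return fig
```

Calling `plot(scores, categories)` draws three bars and leaves the Opus 4.7 white-box slot empty, so nobody reads a number off a line that was never measured.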
Wasn't it already confirmed that small open-weight models were able to detect most of the same headline vulns as Mythos? How is this any different?