What is the "nasty surprise" of Zen 4 AVX512? Sure, it's not quite the 2x speedup you might initially assume, but (unlike Intel's downclocking) it's still a strict upgrade over AVX2, is it not?
> it's still a strict upgrade over AVX2
If you benchmark it against the AVX2 version, it will be slower about half the time.
It's splitting each 512-bit instruction into two 256-bit instructions internally. That's the main nasty surprise.
I suppose it saves a little on decoding, but it's ultimately no more effective than just issuing the two 256-bit instructions yourself.
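To make the "one 512-bit instruction vs. two 256-bit instructions" point concrete, here is a rough sketch in C. It assumes GCC or Clang, since it uses their `vector_size` extension as a stand-in for real registers, and the names `add_512`, `add_2x256` and `split_matches_full` are made up for illustration. It only shows that the two forms do the same work on 16 floats. It does not measure anything, and which instructions you actually get depends on your compiler flags and target.

```c
#include <assert.h>
#include <string.h>

/* GCC/Clang vector extensions: stand-ins for 512-bit and 256-bit registers.
   The compiler lowers these to whatever the target actually supports. */
typedef float v512 __attribute__((vector_size(64))); /* 16 floats */
typedef float v256 __attribute__((vector_size(32))); /*  8 floats */

static_assert(sizeof(v512) == 64, "v512 must be 512 bits wide");
static_assert(sizeof(v256) == 32, "v256 must be 256 bits wide");

/* One 512-bit-wide add over 16 floats. */
void add_512(const float *a, const float *b, float *out)
{
    v512 va, vb, vr;
    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);
    vr = va + vb;
    memcpy(out, &vr, sizeof vr);
}

/* The same 16 floats handled as two 256-bit-wide adds (low half, high half). */
void add_2x256(const float *a, const float *b, float *out)
{
    for (int half = 0; half < 2; ++half) {
        v256 va, vb, vr;
        memcpy(&va, a + 8 * half, sizeof va);
        memcpy(&vb, b + 8 * half, sizeof vb);
        vr = va + vb;
        memcpy(out + 8 * half, &vr, sizeof vr);
    }
}

/* Returns 1 if both versions produce bit-identical results, 0 otherwise. */
int split_matches_full(void)
{
    float a[16], b[16], full[16], split[16];
    for (int i = 0; i < 16; ++i) {
        a[i] = (float)i;
        b[i] = 100.0f - (float)i;
    }
    add_512(a, b, full);
    add_2x256(a, b, split);
    return memcmp(full, split, sizeof full) == 0;
}
```

If you want to test the performance claim itself, you'd have to build both versions for the actual target (e.g. with AVX-512 enabled vs. AVX2 only) and time them on real hardware; this sketch only covers the functional equivalence.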