A lot of the confusion in this thread feels like it comes from thinking in terms of web streaming rather than the workloads this post is targeting.
The article is pretty explicit that this is not about "make Twitch more efficient" or squeezing a bit more perf out of H.264. It is about mezzanine and archival formats that are already way beyond what a single CPU, even a high-end workstation CPU, handles comfortably in real time: 4K/6K/8K+ 16-bit, FFv1-style lossless, ProRes RAW, huge DPX sequences, etc. People cutting multi-camera timelines of that kind of material are already on the wrong side of the perf cliff and are often locked into very specific hardware or vendors.
What Vulkan compute buys you here is not "GPUs good, CPUs bad"; it is the ability to keep the entire codec pipeline resident on the GPU once the bitstream is there, using the same device that is already doing color, compositing and FX, and to do it in a portable way. FFmpeg's model matters too: all the hairy parts stay in software (parsing, threading, error handling), and only the hot pixel crunching is offloaded. That makes this far more maintainable than the usual fragile vendor-API route and keeps a clean software fallback path when the hardware is not available.
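To make that concrete, here is a rough sketch of what the split looks like from the command line, assuming an FFmpeg build with Vulkan support and Vulkan decode for the codec in question; the file names are placeholders and exact filter availability depends on your build:

```shell
# Bitstream parsing stays on the CPU; decode and scaling happen on the
# GPU via Vulkan, and frames are only downloaded at the very end.
ffmpeg -init_hw_device vulkan=vk:0 \
       -hwaccel vulkan -hwaccel_output_format vulkan \
       -i master.mkv \
       -vf "scale_vulkan=w=1920:h=1080,hwdownload,format=yuv420p" \
       -c:v libx264 proxy.mp4
```

The interesting part is that dropping `-hwaccel vulkan` gives you the exact same pipeline in pure software, which is the clean fallback path the parent comment is pointing at.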
From a practical angle, this is less about winning a benchmark against a good CPU encoder for 4K H.264 and more about changing what is feasible on commodity hardware: e.g., scrubbing multiple streams of 6K/8K ProRes or FFv1 on a consumer GPU instead of needing a fat workstation or dailies transcoded down to lighter proxies. For people doing archival work or high-end finishing on a budget, that is a real qualitative change, not just an incremental efficiency tweak.