Because it's a neural technique, not one based on pixels or frames.
https://blog.metaphysic.ai/what-is-neural-compression/
Instead of artifacts in pixels, you'll see artifacts in larger features.
https://arxiv.org/abs/2412.11379
See Figure 5 onward in that paper.
Like a visual version of psychoacoustic compression. Neat. Thanks for sharing.