
echelon · last Saturday at 3:04 AM

AI models are a form of compression.

Neural compression wouldn't be like HEVC, which operates on frames and pixels. Rather, these techniques can encode entire features and optical flow, which would explain the larger discrepancies: bigger fingers, slightly misplaced items, etc.

Neural compression techniques reshape the image itself.
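
A rough sketch of why (PyTorch; the architecture and every size here are hypothetical, just to show the shape of the idea): the image is squeezed through a small learned latent, so decoding reconstructs features rather than pixels.

```python
import torch
import torch.nn as nn

class ToyNeuralCodec(nn.Module):
    """Toy autoencoder standing in for a learned image codec."""
    def __init__(self):
        super().__init__()
        # Encoder: 3x64x64 image -> a small latent "bitstream"
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, 256),  # whole image squeezed into 256 numbers
        )
        # Decoder: latent -> image. Detail not captured in the latent gets
        # re-synthesized, which is where fingers grow and items drift.
        self.decoder = nn.Sequential(
            nn.Linear(256, 64 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

codec = ToyNeuralCodec()
img = torch.rand(1, 3, 64, 64)  # stand-in input image
recon = codec(img)              # "decoded" image: same shape, new pixels
print(recon.shape)              # torch.Size([1, 3, 64, 64])
```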

If you've ever input an image into `gpt-image-1` and asked it to output it again, you'll notice that it's 95% similar, but entire features might move around or average out toward the model's concept of what those items are.
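
That round trip is easy to reproduce. A sketch with the OpenAI Python SDK (the prompt wording and the crude pixel-diff metric below are my own illustration, not anything official):

```python
import base64
from io import BytesIO

import numpy as np
from openai import OpenAI
from PIL import Image

client = OpenAI()

with open("input.png", "rb") as f:
    result = client.images.edit(
        model="gpt-image-1",
        image=f,
        prompt="Reproduce this image exactly, changing nothing.",
    )

roundtrip = Image.open(BytesIO(base64.b64decode(result.data[0].b64_json)))
original = Image.open("input.png").convert("RGB")
roundtrip = roundtrip.convert("RGB").resize(original.size)

# Per-pixel agreement is high, but it won't be 100%: features are
# re-synthesized from the model's concept of the image, not copied.
diff = np.abs(np.asarray(original, float) - np.asarray(roundtrip, float))
print(f"mean per-channel error: {diff.mean():.1f} / 255")
```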


Replies

jsheard · last Saturday at 3:09 AM

Maybe such a thing could exist in the future, but I don't think the idea that YouTube is already serving a secret neural video codec to clients is very plausible. There would be much clearer signs: dramatically higher CPU usage, and tools like yt-dlp running into bizarre undocumented streams that nothing is able to play.
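
This is checkable today. A sketch using yt-dlp's Python API (the URL is a placeholder for any video): every format YouTube actually serves is tagged with a known codec.

```python
from yt_dlp import YoutubeDL

URL = "https://www.youtube.com/watch?v=..."  # any video

with YoutubeDL({"quiet": True}) as ydl:
    info = ydl.extract_info(URL, download=False)

# In practice this prints avc1 (H.264), vp9/vp09, and av01 (AV1):
# standard, documented codecs, not a mystery neural bitstream.
for fmt in info["formats"]:
    if fmt.get("vcodec") and fmt["vcodec"] != "none":
        print(fmt["format_id"], fmt["vcodec"])
```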

justinclift · last Saturday at 4:15 AM

The resources required for putting AI <something> inline in the input (upload) or output (download) chain would likely dwarf the resources needed for the non-AI approaches.
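
A back-of-envelope sketch of that gap; every number below is an assumed placeholder, and only the structure of the estimate matters:

```python
# Hypothetical per-video costs for a 10-minute upload at 30 fps.
SECONDS_PER_VIDEO = 600
FPS = 30

# Conventional transcode: a fraction of a CPU-core-second per video-second,
# paid once at upload time (0.5 is an assumed figure).
cpu_cost = SECONDS_PER_VIDEO * 0.5  # CPU core-seconds

# Neural codec: GPU inference on every frame, at upload and again at every
# decode if clients can't run the model locally (0.01 is an assumed figure).
gpu_cost = SECONDS_PER_VIDEO * FPS * 0.01  # GPU-seconds per pass

print(f"classic transcode: ~{cpu_cost:.0f} CPU core-seconds, once")
print(f"neural codec:      ~{gpu_cost:.0f} GPU-seconds, per pass")
```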