Hacker News

Hizonner, last Thursday at 7:38 PM

> There are some concerns that an individual perceptual hash can be reversed to a create legible image,

Yeah no. Those hashes aren't big enough to encode any real image, and definitely not an image that would actually be either "useful" to yer basic pedo, or recognizable as a particular person. Maybe they could produce something that a diffusion model could refine back into something resembling the original... if the model had already been trained on a ton of similar material.
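To put a rough number on "not big enough", here is a minimal sketch using a plain average hash (aHash) as a stand-in. This is an assumption for illustration only: it is not PhotoDNA and does not model its actual format or size. The function names are made up for the example. The point is just that a hash of this kind carries a few dozen bits, and the most a direct inversion gives back is a tiny one-bit-per-cell mask.

```python
def average_hash(pixels):
    """Toy 64-bit average hash of an 8x8 grayscale image (values 0..255).

    Each bit records whether one pixel is brighter than the image mean.
    Illustrative stand-in only; NOT the PhotoDNA algorithm.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hash_to_bitmask(h, size=8):
    """Unpack the hash back into a size x size grid of 0/1 values.

    This is all the 'image' the hash directly contains: one bit per cell.
    """
    n = size * size
    return [
        [(h >> (n - 1 - (r * size + c))) & 1 for c in range(size)]
        for r in range(size)
    ]


# A horizontal gradient: columns go 0, 32, 64, ..., 224 (mean is 112).
gradient = [[c * 32 for c in range(8)] for _ in range(8)]
h = average_hash(gradient)
mask = hash_to_bitmask(h)

print(f"{h:016x}")   # 64 bits total, shown as 16 hex digits
for row in mask:
    print("".join("#" if b else "." for b in row))
```

Everything beyond that left-dark, right-bright mask (the actual gray levels, any fine detail) is gone, which is why a reconstruction has to lean on a generative model to fill in the rest.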

> If Microsoft wanted to keep both the hash algorithm and even an XOR filter of the hash database proprietary

That algorithm leaked years ago. Third party code generates exactly the same hashes on the same input. There are open-literature publications on creating collisions (which can be totally innocent images). They have no actual secrets left.
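As a toy illustration of why innocent collisions are unavoidable in principle: any perceptual hash maps a huge space of images onto a small space of hash values, so different images must share hashes. The sketch below uses a simple average hash as a stand-in (an assumption for the example; it is not PhotoDNA, and it is not the technique used in the published collision work).

```python
def average_hash(pixels):
    """Toy 64-bit average hash of an 8x8 grayscale image (values 0..255).

    Illustrative stand-in only; NOT the PhotoDNA algorithm.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


# Image A: a smooth horizontal gradient (0, 32, ..., 224 per row).
image_a = [[c * 32 for c in range(8)] for _ in range(8)]

# Image B: a hard black/white split down the middle.
image_b = [[0 if c < 4 else 255 for c in range(8)] for _ in range(8)]

hash_a = average_hash(image_a)
hash_b = average_hash(image_b)

print(image_a != image_b)   # the pixel data differs
print(hash_a == hash_b)     # yet the hashes are identical
```

Here the collision falls out of the hash design for free; against a real algorithm an attacker has to search for one, but the pigeonhole argument is the same.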


Replies

lynndotpy, last Thursday at 11:52 PM

> > There are some concerns that an individual perceptual hash can be reversed to a create legible image,

> Yeah no.

Well, kind of. Towards Data Science had an article on it that they've since removed:

https://web.archive.org/web/20240219030503/https://towardsda...

And this newer paper: https://eprint.iacr.org/2024/1869.pdf

The reconstructions aren't very good at all (the method just uses a GAN over a recovered bitmask), but it's reasonable for Microsoft to worry that every bit in that hash might be useful. I wouldn't want to distribute all those hashes on a hunch that they could never be used to recover images. I don't think any such recovery would be possible, but that's just a hunch.

That said, I can't speak to the latter claim (that the algorithm leaked) without a source. My understanding is that PhotoDNA still has proprietary implementation details that aren't generally available.
