Hacker News

itake today at 12:54 AM | 4 replies

Just a guess, but they may store the original ID card to audit duplicate accounts.

If their machine learning models think that two accounts belong to the exact same person, having the original image, especially a photo of the same ID card, could confirm that.


Replies

selcuka today at 1:32 AM

There are image processing methods for hashing people's faces. They don't have to store the actual photo to do that.

dathinab today at 1:53 AM

IMHO this is a pretty dumb approach to the problem

while there are probably some countries with terribly designed passports, most passports are designed to be machine readable, even by very old (more than 10 years old) OCR systems

so even if you want to do something like that, you can extract all the relevant information and store just that, maybe also extract the image
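To make the "extract the information and store just that" idea concrete, here is a minimal sketch that reads the second machine-readable line of a passport (the 44-character TD3 layout from ICAO Doc 9303) and validates its check digits. The function names `mrz_check_digit` and `parse_td3_line2` are made up for illustration, and the sketch assumes OCR has already produced a clean line of text.

```python
def mrz_check_digit(field: str) -> int:
    """Check digit per ICAO Doc 9303: weights 7, 3, 1 repeating, sum modulo 10."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10  # A=10 ... Z=35
        elif ch == "<":
            value = 0  # filler character
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * weights[i % 3]
    return total % 10


def parse_td3_line2(line: str) -> dict:
    """Extract a few fields from line 2 of a TD3 (passport) MRZ.

    Hypothetical helper: it keeps only the fields needed for a
    duplicate check and rejects lines whose check digits do not match.
    """
    if len(line) != 44:
        raise ValueError("TD3 line 2 must be exactly 44 characters")

    # (field value, check digit character) pairs at their fixed positions
    checked = {
        "document_number": (line[0:9], line[9]),
        "date_of_birth": (line[13:19], line[19]),   # YYMMDD
        "date_of_expiry": (line[21:27], line[27]),  # YYMMDD
    }
    result = {
        "nationality": line[10:13].replace("<", ""),
        "sex": line[20],
    }
    for name, (value, check) in checked.items():
        if not check.isdigit() or mrz_check_digit(value) != int(check):
            raise ValueError(f"check digit mismatch in {name}")
        result[name] = value.rstrip("<")
    return result


# Specimen line from the ICAO documentation (fictional country code UTO)
fields = parse_td3_line2("L898902C36UTO7408122F1204159ZE184226B<<<<<10")
print(fields["document_number"])  # L898902C3
```

Only the returned dictionary would need to be kept; the scanned image itself is not required once the check digits pass.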

this seems pointless at first, but it isn't: if you store a copy of a photo of an ID, people who steal it can use it to impersonate someone; if they can only steal the information on it, impersonation is harder

outside of impersonation issues, another problem is that it's not uncommon for IDs/passports to technically count as property of the state, so you might not be allowed to store full photocopies of them, and the person they were issued to can't give you permission either (as they don't own the passport, technically speaking). Most of the time that doesn't matter, but if a country wants to screw with you, holding images of IDs/passports is a terrible idea.

but then you should also ask yourself what degree of "duplicate" protection you actually need, which isn't a perfect one. If someone can circumvent it by spending multiple thousands to end up with a new full name plus a fudged ID image, that isn't something a company like Discord really needs to care about. In other words, storing a subset of the information on a passport, potentially hashed, is sufficient for well over 90% of companies' needs for secondary account prevention.
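A rough sketch of what "a subset of the information, potentially hashed" could look like: derive one keyed hash from a few normalized fields and store only that. The function name `duplicate_key`, the choice of fields, and the secret handling are all assumptions for illustration. A keyed HMAC is used instead of a plain hash because document numbers and birth dates have little entropy and a plain hash of them could be brute-forced.

```python
import hashlib
import hmac
import unicodedata


def _normalize(value: str) -> str:
    # Fold equivalent Unicode forms, drop all whitespace, and uppercase,
    # so that " l898902c3 " and "L898902C3" produce the same key.
    folded = unicodedata.normalize("NFKC", value)
    return "".join(folded.split()).upper()


def duplicate_key(secret: bytes, issuing_country: str,
                  document_number: str, date_of_birth: str) -> str:
    """Return a hex digest that can be compared across accounts.

    `secret` is a server-side key (assumed to be stored outside the
    database) so a leaked table of digests cannot be reversed by
    enumerating document numbers.
    """
    parts = [_normalize(p) for p in (issuing_country, document_number, date_of_birth)]
    # Unit separator avoids ambiguity between adjacent fields
    message = "\x1f".join(parts).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


secret = b"example-only-secret"  # placeholder, not a real key
first = duplicate_key(secret, "UTO", "L898902C3", "740812")
second = duplicate_key(secret, "uto", " l898902c3 ", "740812")
print(first == second)  # True: same document, different formatting
```

Two accounts that present the same document end up with the same digest, which is enough for a duplicate check without keeping the document number or the image.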

in the end, the reason a company might store a whole photo is that it's convenient: you can retrospectively apply whatever better model you want to use, and in many places the penalties for a data breach aren't too big. So you might even start out with an "it's bad, but we only do it for a short time while building a better system" situation, and then, because the consequences of not fixing it aren't very threatening (or nobody is aware of it), the fix is constantly de-prioritized and never happens...

Gigachad today at 1:39 AM

Just store the name and the fact that it was verified and delete the photo. You get what you need without holding on to a massive liability.
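As a sketch of that flow, assuming the uploaded photo sits in a temporary file and some verifier returns the name on success: the record keeps only the name, the verified flag, and a timestamp, and the photo is deleted whether or not verification succeeds. `VerificationRecord`, `verify_and_discard`, and the `check_id` callback are hypothetical names, not any real service's API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, Optional


@dataclass(frozen=True)
class VerificationRecord:
    full_name: str
    verified: bool
    verified_at: datetime


def verify_and_discard(photo_path: Path,
                       check_id: Callable[[bytes], Optional[str]]) -> VerificationRecord:
    """Run the verifier on the photo, then delete the photo.

    `check_id` is a stand-in for whatever verification is used; it is
    assumed to return the holder's name, or None if the ID is rejected.
    """
    try:
        name = check_id(photo_path.read_bytes())
    finally:
        # Delete the image even if verification raised an exception
        photo_path.unlink(missing_ok=True)
    return VerificationRecord(
        full_name=name or "",
        verified=name is not None,
        verified_at=datetime.now(timezone.utc),
    )
```

Note that `unlink` only removes the local file; backups, logs, or copies held by a third-party verifier would need their own cleanup.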

fuzzfactor today at 1:15 AM

The best years online were when it was universally recognized that government IDs are completely unsuitable for interaction with the internet in any way.

Like it had been since the beginning, when government IDs first became a thing.