The only data that cannot be stolen or leaked is data that doesn't exist. Hard lesson for both users and companies.
Germans (because of course) have a word for this: "Datensparsamkeit". Being frugal with your data.
I miss the pre-LLM days when you could make a decent argument that having any unnecessary data was just a liability. Now all anybody thinks is “more data for the AI!”
> The only data that cannot be stolen or leaked is data that doesn't exist. Hard lesson for both users and companies.
Except no company is learning this lesson.
The enterprise threat model includes "our own users", and the modus operandi is to maintain as much information on that threat as possible.
Data that is publicly available also can't be stolen or leaked. Nobody can steal Mozilla's Common Voice dataset.
Data can never be stolen, because it is not a physical thing. Data can be copied, and it can be erased - sometimes both happen at the same time. Data can be lost; that is when its last existing copy is erased.
The only winning move is not to play.
Seems a bit like blaming the victim? Your voice (like DNA) is kind of ambient data that's hard to hide.
> Germans (because of course)
I don't know if it's the reason you imply. In the 70s, there were big debates in Germany about privacy and data storage. They spoke of one's data shadow (Datenschatten). I suspect this word comes from that tradition. The reason the word exists would then be the reflection (Vergangenheitsbewältigung) on WW2.