Frankly, I see no reason to keep this data private. They should simply publish a full dataset of the census, with no such data coarsening/differential privacy/ etc...
Fundamentally, this is public data. If it's too dangerous to make public, it's too dangerous to collect, and people should be aware of exactly what it is.
There are very few things that the state has data on that should not be made public. Census data is simply not one of those things.
Publishing should be the default for any data, and keeping it unpublished should require substantial reasons that affect the country as a whole. Frankly, if it isn't detailed national defence plans, I struggle to see any data that should not be public.
1. People give the information to the government under the expectation that it will be kept private or used in such a way that individual targeting is impossible. Break that expectation and people will lie or refuse to give you this data.
2. Without noise injection it's rather simple to run statistical attacks that reverse engineer individual records from the published numbers.
3. This data has already been used in the past to undermine democratic systems by targeting and disenfranchising minorities, as well as gerrymandering the US to hell.
4. "Too dangerous to make public, too dangerous to collect" - this is a false dichotomy. To govern effectively you need sensitive data, but it should be collected and used in a way that's safe for the individuals.
5. Macro-level aggregates don't need individual exposure; that's why noise, anonymization and statistical functions are fine.
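To make points 2 and 5 concrete, here is a rough sketch of a differencing attack on exact aggregates, and of how Laplace noise blunts it. Everything in it is made up for illustration: the `BLOCK` records, the `income_cap` clipping bound and the `epsilon` value are my assumptions, and this is not how the Census Bureau's actual disclosure avoidance system works.

```python
import random

# Hypothetical toy block of (age, income) records. Not real census data.
BLOCK = [(34, 52_000), (41, 61_000), (29, 48_000), (67, 250_000)]

def total_income(records, max_age=None):
    """Exact aggregate: total income of residents aged max_age or younger."""
    return sum(inc for age, inc in records if max_age is None or age <= max_age)

def laplace_noise(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def noisy_total_income(records, max_age, epsilon, income_cap, rng):
    """Laplace mechanism sketch: clip each income to income_cap so one person
    can change the sum by at most income_cap, then add noise of scale
    income_cap / epsilon. epsilon and income_cap are illustrative choices."""
    clipped = [(age, min(inc, income_cap)) for age, inc in records]
    return total_income(clipped, max_age) + laplace_noise(income_cap / epsilon, rng)

# Differencing attack: two harmless-looking exact totals isolate the one
# resident over 64 and reveal that person's income exactly.
leaked = total_income(BLOCK) - total_income(BLOCK, max_age=64)
print("exact difference:", leaked)

# With noise on each released total, the same subtraction is unreliable.
rng = random.Random(0)
noisy_all = noisy_total_income(BLOCK, None, epsilon=1.0, income_cap=300_000, rng=rng)
noisy_young = noisy_total_income(BLOCK, 64, epsilon=1.0, income_cap=300_000, rng=rng)
print("noisy difference:", noisy_all - noisy_young)
```

On a four-person block the noise swamps the totals as well, which is the point: small cells are exactly where individuals are exposed. On a large population the same noise scale is a rounding error on the aggregate. Each noisy release also spends privacy budget, so the two queries here cost 2 × `epsilon` in total.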
That's a good default position, and I think it should be our starting point.
But the devil is in the details. If we don't want advertisers constructing semi-complete profiles from simple web interactions then why would we publish 330 million census questionnaires for their use?
> They should simply publish a full dataset of the census, with no such data coarsening/differential privacy/ etc...
They do. After a substantial delay. Pretty handy for genealogical research, while protecting privacy for the living.
Then dox yourself right now with your previous census answers and PII. There are several obvious reasons to keep the data private; all you have to do is use your brain.
Don’t quit your day job. One guess as to what gender, sexual orientation, and skin colour you have.
How hard have you thought about this?
The biggest challenge with running a census is getting people to trust you enough to answer your questions.
A lot of census questions are sensitive. The ACS covers topics like citizenship status, disabilities, income, SNAP assistance, and languages spoken at home.
If you want accurate information about the people who live in your country you need the census process to feel as safe for people to respond to as possible.
Are you saying the census shouldn't collect any data that people wouldn't be comfortable publishing? Because that's a recipe for a census that is far less useful for helping the country make good decisions.