Hacker News

winchester6788 last Thursday at 5:22 PM

Author of NudeNet here.

I just scraped data from Reddit and other sources so I could build an NSFW classifier, and I chose to open-source the data and the model for the general good.

Note that I was an engineer with one year of experience, working on this project alone in my free time, so it was basically impossible for me to review or clear out the few CSAM images among the 100,000+ images in the dataset.

Now, though, I wonder whether I should have open-sourced the data at all. Keeping it private would have avoided a lot of these issues.


Replies

markatlarge last Thursday at 6:02 PM

I’m the developer who actually got banned because of this dataset. I used NudeNet offline to benchmark my on-device NSFW app, Punge: nothing uploaded, nothing shared.

Your dataset wasn’t the problem. The real problem is that independent developers have zero access to the tools needed to detect CSAM, while Big Tech keeps those capabilities to itself.

Meanwhile, Google and other giants openly use massive datasets like LAION-5B, which also contained CSAM, without facing any consequences at all. Google even used early LAION data to train one of its own models. Nobody bans Google. But when I touched NudeNet for legitimate testing, Google deleted 130,000+ files from my account, even though only ~700 images out of ~700,000 were actually problematic. That’s not safety. That’s a detection system wildly overfiring, with no independent oversight and no accountability.

Big Tech designed a world where they alone have the scanning tools and the immunity when those tools fail. Everyone else gets punished for their mistakes. So yes, your dataset has done good. Any dataset is subject to this. There need to be tools and processes available to everyone.

But let’s be honest about where the harm came from: a system rigged so only Big Tech can safely build or host datasets, while indie developers get wiped out by the exact same automated systems Big Tech exempts itself from.

qubex last Thursday at 6:44 PM

I in no way want to underplay the seriousness of child sexual abuse, but as a naturist I find all this paranoia around nudity and “not safe for work” to be somewhere between hilarious and bewildering. Normal is what you grew up with I guess, and I come from an FKK family. What’s so shocking about a human being? All that stuff in public speaking about “imagine your audience is naked”. Yeah, fine: so what’s Plan B?