Hacker News

NitpickLawyer, yesterday at 8:59 PM

Umm... what data? That's a very old newsletter-like site. Everything that's public on it has been long scraped and parsed by whoever needed it. There's 0 valuable data there for "parasites" to parasite off of.

I also don't get the comments on the linked social site. IIUC the users posting there are somehow involved with kernel work, right? So they should know a thing or two about technical stuff? How and why are they so convinced that the big bad AI baddies are scraping them, and not some misconfigured thing that someone or other built? Is this their first time? Again, there's nothing there that hasn't been indexed dozens of times already. And... sorry to say it, but neither newsletters nor the 1-3 comments on each article are exactly "prime data" for any kind of training.

These people have gone full tinfoil hat, and spewing hate isn't doing them any favours.


Replies

homebrewer, yesterday at 9:17 PM

Because it started in 2022 and hasn't subsided since? This is just the latest iteration of "AI" scrapers destroying the site, and the worst one yet.

https://lwn.net/Articles/1008897

Your nonsense about LWN being a "newsletter" and having "zero valuable data" isn't doing you any favors. It is the prime source of information about Linux kernel development, and Linux development in general.

"AI" cancer scraping the same thing over and over and over again is not news to anybody with even a cursory interest in this subject. They've been doing it for years.

MBCook, yesterday at 9:01 PM

I don’t think they were talking about LWN specifically but just in general.