Hacker News

candiddevmike yesterday at 10:50 PM

I feel like this is such a tragedy of the commons for the LLM providers. Wikipedia probably makes up a huge share of their training data, so why taint it? It would be interesting if they adopted some kind of "you shall not use our platform on Wikipedia" stance.


Replies

kingstnap today at 12:48 AM

Wikipedia having incorrect citations is way older than LLMs. As many other people have pointed out in this thread, if you start pulling strings, a lot of what people write starts falling apart.

It's not even unique to Wikipedia. It's really not difficult to find very misleading statements backed by a citation that doesn't even support the claim when you check the original.

ohyoutravel yesterday at 10:53 PM

I don’t think it’s the providers doing this, it’s the awful users. They’re doing the same thing on GitHub. It’s maddening.

MattGaiser yesterday at 11:04 PM

It would be random individuals.