Hacker News

baubino, today at 9:26 AM

The problem is that LLM “summaries” do not cite sources. They furthermore don’t distinguish between making summaries and taking direct quotes; that “summary” is often directly lifting text that someone wrote. LLMs don’t cite in either case. It’s a clear case of plagiarism, but tech companies are being allowed to get away with it.

Publishing in a paid publication is not a solution because tech companies are scraping those too. It’s absolutely criminal. As an individual, I would be in clear violation of the law if I took text someone else wrote (even if that text was in the public domain) and presented it as my own without attribution.

From an academic perspective, LLM summaries also undermine the purpose of having clear and direct attribution for ideas. Citing sources not only makes clear who said what; it also allows the reader to know who is responsible for faulty knowledge. I’ve already seen this in my line of work, where LLMs have significantly boosted incorrect data. The average reader doesn’t know this data is incorrect and in fact can’t verify any of the data because there is no attribution. This could have serious consequences in areas like medicine.