Hacker News

tialaramex 12/07/2025

Whether in the 1970s or now, it's too often the case that a paper says "Foo and Bar are X" and cites two sources for this fact. You chase down the sources, the first one says "We weren't able to determine whether Foo is X" and never mentions Bar. The second says "Assuming Bar is X, we show that Foo is probably X too".

The paper's author likely believes Foo and Bar are X, and it may well be that all their co-workers, if asked, would say the same. But "Everybody I have coffee with agrees" can't be cited, so we get this sort of junk citation.

Hopefully it's not crucial to the new work that Foo and Bar are in fact X. But that's not always the case, and the problem compounds when, years later, somebody else cites this paper for the claim "Foo and Bar are X", which it was in fact merely citing erroneously.


Replies

KHRZ 12/07/2025

LLMs can actually make up for their negative contributions. They could go through all the references of all papers and verify them, assuming someone would also look into what gets flagged for that final seal of disapproval.

But this would be more powerful with an open knowledge base where all papers and citation verifications were registered, so that all the effort put into verification could be reused, and errors propagated through the citation chain.
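To make the propagation idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `cites` mapping, the `failed_checks` set of (citing, cited) pairs, and the function name `papers_needing_review` are hypothetical, not part of any existing knowledge base. It simply walks the citation graph backwards from each paper with a failed citation check and records how many hops downstream every affected paper is.

```python
from collections import deque


def papers_needing_review(cites, failed_checks):
    """Return {paper: hops} for papers downstream of a failed citation check.

    cites:         dict mapping a paper to the set of papers it cites
                   (hypothetical format).
    failed_checks: set of (citing, cited) pairs where the cited source did
                   not support the claim (hypothetical format).
    """
    # Invert the citation graph: cited paper -> papers that cite it.
    cited_by = {}
    for paper, refs in cites.items():
        for ref in refs:
            cited_by.setdefault(ref, set()).add(paper)

    # Papers whose own citation failed verification are the starting points.
    hops = {citing: 0 for citing, _cited in failed_checks}
    queue = deque(hops)

    # Breadth-first walk to everything that cites a flagged paper.
    while queue:
        paper = queue.popleft()
        for downstream in cited_by.get(paper, ()):
            if downstream not in hops:
                hops[downstream] = hops[paper] + 1
                queue.append(downstream)
    return hops


# Toy example: B misquotes A, C cites B, D cites C and A, E cites only A.
cites = {
    "B": {"A"},
    "C": {"B"},
    "D": {"C", "A"},
    "E": {"A"},
}
failed_checks = {("B", "A")}
flagged = papers_needing_review(cites, failed_checks)
```

In this toy graph only B, C and D end up flagged; A and E do not, because the error was in how B cited A, not in A itself. A flag here only means "worth a second look", since a downstream paper may cite the flagged one for a completely different claim.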

HPsquared 12/07/2025

Wikipedia calls this citogenesis.