Hacker News

cubefox, yesterday at 8:24 PM

Yeah. One example of people mindlessly mass-citing some random paper: chain-of-thought (CoT) prompting was used in the past to greatly enhance the reasoning ability of LLMs. When CoT is discussed, this is the paper that usually gets cited:

https://arxiv.org/abs/2201.11903

It has over 20,000 citations according to Google Scholar. But the technique clearly wasn't invented by these authors. It was already known about 1.5 years earlier, just after GPT-3 came out:

https://xcancel.com/kleptid/status/1284069270603866113#m

Perhaps even earlier than that. But the paper above is cited nonetheless, probably because there is pressure to cite something and the title of that paper sounds as if the authors pioneered the technique. I doubt many of the people who cite it have even read it.

Another funny example: in machine learning and some other fields, a success measure named "Matthews Correlation Coefficient" (MCC) is used. It's named after a biochemist, Brian Matthews, who used it in a paper from 1975. Needless to say, he didn't invent it at all; he just used the binary special case of the ordinary Pearson correlation coefficient (also known as the phi coefficient), which was already well known. The people who named the measure "MCC" apparently thought he had invented it. Matthews probably just didn't bother to cite any sources himself.
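To make the MCC point concrete, here is a minimal sketch (plain Python, no libraries) that computes the MCC from confusion-matrix counts and the ordinary Pearson correlation on the same 0/1 labels. The function names and the toy labels are made up for illustration; the only claim is that the two numbers coincide for binary data.

```python
import math

def pearson(xs, ys):
    """Ordinary Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def mcc(y_true, y_pred):
    """MCC computed from the four confusion-matrix counts of 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom

# Hypothetical toy labels, chosen only for illustration.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

print(mcc(y_true, y_pred))      # MCC from the confusion matrix
print(pearson(y_true, y_pred))  # Pearson correlation on the raw 0/1 vectors
```

Both calls give the same value on this data (up to floating-point rounding), which is the sense in which the MCC is just the correlation coefficient restricted to binary variables. Note that the sketch does not handle the degenerate case where one of the label vectors is constant and the denominator is zero.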