Hacker News

bob1029, last Saturday at 3:29 PM (3 replies)

> The advent of large language models have made this type of content relatively easy to churn out on demand, and the majority of the review articles we receive are little more than annotated bibliographies, with no substantial discussion of open research issues.

I have to agree with their justification. Since "Attention Is All You Need" (2017) I have seen maybe four papers with similar impact in the AI/ML space. The signal to noise ratio is really awful. If I had to pick a semi-related paper published since 2020 that I actually found interesting, it would have to be this one: https://arxiv.org/abs/2406.19108 I cannot think of a close second right now.

All of the machine learning papers are pure slop to me now. The last one I looked at had an abstract that was so long it put me to sleep. Many of these papers aren't attempting basic decorum anymore. Mandatory peer review would fix a lot of this. I don't think it is acceptable for the staff at arXiv to have to endure a Sisyphean mountain of LLM shit. They definitely need to push back.


Replies

an0malous, last Saturday at 4:11 PM

Isn’t the signal to noise problem what journals are supposed to be for? I thought arxiv was supposed to just be a record keeper, to make it easy to share papers and preprints.

Al-Khwarizmi, last Saturday at 10:27 PM

You picked arguably the most impactful AI/ML paper of the century so far; no wonder you don't find others with similar impact.

Not every paper can be a world-changing breakthrough. Which doesn't mean that more modest papers are noise (although some definitely are). What Kuhn calls "normal science" is also needed for science to work.

programjames, last Saturday at 3:38 PM

This is only for review/position papers, though I agree that pretty much all ML papers for the past 20 years have been slop. I also consider the big names like "Adam", "Attention", or "Diffusion" slop, because even though they are powerful and useful, the presentation is so horrible (for the first two) or they contain such major mistakes in the justification of why they work (the last two) that they should never have gotten past review without major rewrites.