Hacker News

Updated practice for review articles and position papers in ArXiv CS category

483 points | by dw64 | last Saturday at 2:58 PM | 228 comments

Comments

efitz | last Saturday at 4:20 PM

There is a general problem with rewarding people for the volume of stuff they create, rather than the quality.

If you incentivize researchers to publish papers, individuals will find ways to game the system, meeting the minimum quality bar, while taking the least effort to create the most papers and thereby receive the greatest reward.

Similarly, if you reward content creators based on views, you will get view maximization behaviors. If you reward ad placement based on impressions, you will see gaming for impressions.

Bad metrics or bad rewards cause bad behavior.

We see this over and over because the reward issuers are designing systems to optimize for their upstream metrics.

Put differently, the online world is optimized for algorithms, not humans.

Sharlin | last Saturday at 3:18 PM

So what they no longer accept is preprints (or rejects…). It's of course a pretty big deal, given that arXiv is all about preprints. And an accepted journal paper presumably cannot be submitted to arXiv anyway, unless it's in an open journal.

amelius | last Saturday at 3:22 PM

Maybe it's time for a reputation system. E.g. every author publishes a public PGP key along with their work. Not sure about the details but this is about CS, so I'm sure they will figure something out.
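One minimal shape of that idea, sketched with an HMAC standing in for a real PGP signature so the example stays self-contained (real PGP uses asymmetric keys; the key material and paper text below are made up):

```python
import hashlib
import hmac

# Toy stand-in for the signing step: the author signs a paper's bytes
# with a private secret, and anyone holding the matching record can
# check that later submissions come from the same keyholder. A
# reputation system could then attach a track record to each key.

def sign(secret: bytes, paper: bytes) -> str:
    return hmac.new(secret, paper, hashlib.sha256).hexdigest()

def verify(secret: bytes, paper: bytes, signature: str) -> bool:
    return hmac.compare_digest(sign(secret, paper), signature)

author_key = b"author-private-key"          # hypothetical key material
paper = b"Survey of agentic LLM systems"    # hypothetical submission

sig = sign(author_key, paper)
assert verify(author_key, paper, sig)
assert not verify(author_key, b"tampered text", sig)
```

The open detail, as the comment says, is the hard part: binding keys to real identities and deciding how reputation accrues to a key over time.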

DalasNoin | last Saturday at 3:22 PM

It's clearly not sustainable to have the main website hosting CS articles operate without any reviews or restrictions (except for the initial invite system). There were 26k submissions in October: https://arxiv.org/stats/monthly_submissions

Asking for a small amount of money would probably help. The issue with requiring peer-reviewed journals or conferences is the severe lag: review takes a long time, and part of the advantage of arXiv was that you could have the paper instantly as a preprint. These conferences and journals are also receiving enormous quantities of submissions (29,000 for AAAI), so we are just pushing the problem elsewhere.

thomascountz | last Saturday at 3:11 PM

The HN submission title is incorrect.

> Before being considered for submission to arXiv’s CS category, review articles and position papers must now be accepted at a journal or a conference and complete successful peer review.

Edit: original title was "arXiv No Longer Accepts Computer Science Position or Review Papers Due to LLMs"

currymj | last Saturday at 3:37 PM

I would like to understand what people get, or think they get, out of putting a completely AI-generated survey paper on arXiv.

Even if AI writes the paper for you, it's still kind of a pain in the ass to go through the submission process, get the LaTeX to compile on their servers, etc., there is a small cost to you. Why do this?

jruohonen | yesterday at 5:37 PM

One thing I forgot to speculate: a position paper on DEI and Cornell University...

whatpeoplewant | last Saturday at 8:27 PM

Great move by arXiv—clear standards for reviews and position papers are crucial in fast-moving areas like multi-agent systems and agentic LLMs. Requiring machine-readable metadata (type=review/position, inclusion criteria, benchmark coverage, code/data links) and consistent cross-listing (cs.AI/cs.MA) would help readers and tools filter claims, especially in distributed/parallel agentic AI where evaluation is fragile. A standardized “Survey”/“Position” tag plus a brief reproducibility checklist would set expectations without stifling early ideas.
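A machine-readable metadata block of the kind proposed above might look like the following. Every field name here is hypothetical, not part of any real arXiv schema:

```python
import json

# Hypothetical per-submission metadata that tools could filter on
# without parsing the PDF. None of these fields exist in arXiv today;
# this only illustrates the shape of the proposal.
metadata = {
    "type": "review",                       # or "position"
    "categories": ["cs.AI", "cs.MA"],       # consistent cross-listing
    "inclusion_criteria": "peer-reviewed agentic-LLM papers, 2022 onward",
    "benchmark_coverage": ["AgentBench"],   # made-up example entry
    "code_url": "https://example.org/repo", # placeholder link
    "peer_reviewed": True,
}

# Round-trip through JSON, as an indexing tool would consume it:
parsed = json.loads(json.dumps(metadata))
assert parsed["type"] in {"review", "position"}
assert "cs.AI" in parsed["categories"]
```

The point is only that a structured "Survey"/"Position" tag is trivially filterable, whereas the same information buried in an abstract is not.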

Quizzical4230 | yesterday at 8:50 AM

Shameless plug.

PaperMatch [1] helps address this problem (the large influx of papers) by running a semantic search over the abstracts of all of arXiv.

[1]: https://papermatch.me/
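The idea behind such a search, reduced to a few lines: embed each abstract as a vector, embed the query the same way, and rank by cosine similarity. The vectors below are made up for illustration; a real system uses a learned embedding model.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product divided by the product of norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "embeddings" keyed by (fictional) arXiv IDs.
abstracts = {
    "arXiv:0001": [0.9, 0.1, 0.0],   # mostly about topic A
    "arXiv:0002": [0.1, 0.9, 0.2],   # mostly about topic B
}
query = [0.8, 0.2, 0.1]              # a query close to topic A

ranked = sorted(abstracts, key=lambda k: cosine(abstracts[k], query),
                reverse=True)
assert ranked[0] == "arXiv:0001"
```

At arXiv scale the sort is replaced by an approximate-nearest-neighbor index, but the ranking principle is the same.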

generationP | last Saturday at 6:05 PM

I have a hunch that most of the slop is not just in CS but specifically about AI. For some reason, a lot of people's first idea when they encounter an LLM is "let's have this LLM write an opinion piece about LLMs", as if they want to test its self-awareness or hack it by self-recursion. And then they get a medley of the training data, which, if they are lucky, contains some technical explanations sprinkled in.

That said, AI-generated papers have already been spotted in other disciplines besides CS, and some of them are really obvious (arXiv:2508.11634v1 starts with a review of a non-existent paper). I really hope arXiv won't react by narrowing its scope to "novel research only"; in fact, there is already AI slop in that category, and it is harder for a moderator to spot.

("Peer-reviewed papers only" is mostly equivalent to "go away". Authors post on the arXiv in order to get early feedback, not just to have their paper openly accessible. And most journals at least formally discourage authors from posting their papers on the arXiv.)

jruohonen | yesterday at 3:33 PM

A very weird move. They are now taking a stance on what science is supposed to be.

As someone commented, given the increasing volume we would actually need, and benefit from, more reviews -- preferably on a fixed cycle -- and I do not mean LLM slop but SLRs (systematic literature reviews). And contrary to someone's post, it is actually nice to read things from industry, and I would like to see more of that.

And not only are they taking a stance on science; they also make this allegation:

"Please note: the review conducted at conference workshops generally does not meet the same standard of rigor of traditional peer review and is not enough to have your review article or position paper accepted to arXiv."

In fact -- and this is presumably related to the peer-review crisis -- the situation is exactly the opposite. That is, reviews today are usually of much higher quality at specialized workshops organized by experts in a particular, often niche, area.

Maybe arXiv people should visit PubPeer once in a while to see what kind of fraud is going on with conferences (i.e., not workshops and usually not review papers) and their proceedings published by all notable CS publishers? The same goes for journals.

ants_everywhere | last Saturday at 3:32 PM

I'm not sure this is the right way to handle it (I don't know what is), but arXiv.org has suffered from poor-quality self-promotion papers in CS for a long time now. Years before LLMs.

physarum_salad | last Saturday at 3:24 PM

The review paper is dead, so this is a good development. You can generate these things in a couple of iterations with AI and minor edits. Preprint servers could be dealing with thousands of review/position papers over short periods, and that wastes precious screening work hours.

It is a bit different in other fields, where interpretations or know-how might be communicated in a review-paper format that is otherwise not possible. For example, in biology, relating to a new phenomenon or function.

kittikitti | yesterday at 8:37 PM

In my experience, arXiv is not a preprint platform. It's a strange gatekeeper of science and should be avoided altogether. They have their favorites, which they deem "high quality", and everything else gets rejected. I am eagerly waiting for people to dismiss arXiv altogether.

naveen99 | last Saturday at 3:40 PM

Isn’t GitHub the normal way of publishing now for CS?

exasperaited | last Saturday at 3:34 PM

The Tragedy of the Commons, updated for LLMs. Part #975 in a continuing series.

These things will ruin everything good, and that is before we even start talking about audio or video.

an0malous | last Saturday at 4:13 PM

Why not just reject papers authored by LLMs and ban accounts that are caught? arXiv’s management has become really questionable lately; it’s like they’re trying to become a prestigious journal and are becoming the very problem they were originally trying to solve.

GMoromisato | last Saturday at 5:04 PM

I suspect that LLMs are better at classifying novel vs junk papers than they are at creating novel papers themselves.

If so, I think the solution is obvious.

(But I remind myself that all complex problems have a simple solution that is wrong.)

beloch | last Saturday at 4:44 PM

A better policy might be for arXiv to do the following:

1. Require LLM-produced papers to be attributed to the relevant LLM, not the person who wrote the prompt.

2. Treat submissions that misrepresent authorship as plagiarism. Remove the article, but leave an entry for it so that there is a clear indication that the author engaged in an act of plagiarism.

Review papers are valuable. Writing one is a great way to gain, or deepen, mastery over a field. It forces you to branch out and fully assimilate papers that you may have only skimmed, and then place them in their proper context. Reading quality review papers is also valuable. They're a great way for people new to a field to get up to speed and they can bring things that were missed to the fore, even for veterans of the field.

While the current generation of AI does a poor job of judging significance and highlighting what is actually important, they could improve in the future. However, there's no need for arXiv to accept hundreds of review papers written by the same model on the same field, and readers certainly don't want to sift through them all.

Clearly marking AI submissions and removing credit from the prompters would adequately future-proof things for when, and if, AI can produce high quality review papers. Clearly marking authors who engage in plagiarism as plagiarists will, hopefully, remove most of the motivation to spam arXiv with AI slop that is misrepresented as the work of humans.

My only concern would be for the cost to arXiv of dealing with the inevitable lawsuits. The policy arXiv has chosen is worse for science, but is less likely to get them sued by butt-hurt plagiarists or the very occasional false positive.

bob1029 | last Saturday at 3:29 PM

> The advent of large language models have made this type of content relatively easy to churn out on demand, and the majority of the review articles we receive are little more than annotated bibliographies, with no substantial discussion of open research issues.

I have to agree with their justification. Since "Attention Is All You Need" (2017), I have seen maybe four papers with similar impact in the AI/ML space. The signal-to-noise ratio is really awful. If I had to pick a semi-related paper published since 2020 that I actually found interesting, it would have to be this one: https://arxiv.org/abs/2406.19108 I cannot think of a close second right now.

All of the machine learning papers are pure slop to me now. The last one I looked at had an abstract that was so long it put me to sleep. Many of these papers aren't attempting basic decorum anymore. Mandatory peer review would fix a lot of this. I don't think it is acceptable for the staff at arXiv to have to endure a Sisyphean mountain of LLM shit. They definitely need to push back.

goldenjm | yesterday at 1:26 AM

Their argument in favor of this change seems extremely reasonable and well-explained.

zekrioca | last Saturday at 7:34 PM

Two perspectives: Either (I) LLMs made survey papers irrelevant, or (II) LLMs killed a useful set of arXiv papers.

iberator | last Saturday at 3:35 PM

Simple solution: criminalize posting AI-generated publications IF NOT DISCLOSED CLEARLY.

Let's say a €50,000 fine, or 1 year in prison. :)

internetguy | last Saturday at 4:58 PM

This should honestly have been implemented a long time ago. Much of academia is pressured to churn out papers month after month, because the system prioritizes volume over quality or impact.

mottiden | last Saturday at 3:31 PM

I understand their reasoning, but it’s terrible for the CS community to lose access to these pre-prints. I hope that a solution can be found.

ninetyninenine | last Saturday at 3:58 PM

Didn’t realize LLMs were restricted to only CS topics.

I don’t understand why they restricted one category when the problem spans multiple categories.

j45 | last Saturday at 3:17 PM

Have the papers gotten that good or bad?

whatever1 | last Saturday at 8:30 PM

The number of content generators is now effectively infinite, but the number of content reviewers is the same.

Sorry folks but we lost.

jsrozner | last Saturday at 8:46 PM

I had a convo with a senior CS prof at Stanford two years ago. He was excited about LLM use in paper writing to, e.g., "lower barriers" to idk, "historically marginalized groups" and to "help non-native English speakers produce coherent text". Etc, etc - all the normal tech folk gobbledygook, which tends to forecast great advantage with minimal cost...and then turn out to be wildly wrong.

There are far more ways to produce expensive noise with LLMs than signal. Most non-psychopathic humans tend to want to produce veridical statements. (Except salespeople, who have basically undergone forced sociopathy training.) At the point where a human has learned to produce coherent language, he's also learned lots of important things about the world. At the point where a human has learned academic jargon and mathematical nomenclature, she has likely also learned a substantial amount of math. Few people want to learn the syntax of a language with little underlying understanding. Alas, this is not the case with statistical models of papers!

pwlm | last Saturday at 10:04 PM

"review articles and position papers must now be accepted at a journal or a conference and complete successful peer review."

How will journals or conferences handle AI slop?

ThrowawayTestr | last Saturday at 3:16 PM

This is hilarious. Isn't arXiv the place where everyone uploads their paper?


arendtio | last Saturday at 3:22 PM

I wonder why they can't use LLMs in the review process (fighting fire with fire). Are even the best models not capable enough, or are the costs too high?

zackmorris | last Saturday at 3:40 PM

I always figured if I wrote a paper, the peer review would be public scrutiny. As in, it would have revolutionary (as opposed to evolutionary) innovations that disrupt the status quo. I don't see how blocking that kind of paper from arXiv helps hacker culture in any way, so I oppose their decision.

They should solve the real problem: obtaining more funding and volunteers so that they can take on the increased volume of submissions. Especially now that AI's here and we can all be three times as productive for the same effort.

show 2 replies