Hacker News

getnormality · yesterday at 4:49 PM

I wouldn't be surprised if the headline is accurate, but AI detectors are widely understood to be unreliable, and I see no evidence that this AI detector has overcome the well-deserved stigma.


Replies

maxspero · yesterday at 5:43 PM

Co-founder of Pangram here. Our false positive rate is typically around 1 in 10,000. https://www.pangram.com/blog/all-about-false-positives-in-ai....

We also wanted to quantify our EditLens model's FPR on the same domain, so we ran it on all of the ICLR 2022 reviews. Of the 10,202 reviews, Pangram marked 10,190 as fully human, 10 as lightly AI-edited, 1 as moderately AI-edited, 1 as heavily AI-edited, and none as fully AI-generated.

That's roughly a 1-in-1,000 FPR for light AI edits and a 1-in-10,000 FPR for heavy AI edits.
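The quoted rates follow directly from the counts above; here's a quick sanity check of the arithmetic (the counts are taken from this comment, not from Pangram's code):

```python
# Counts from the ICLR 2022 run described above.
total = 10_202
fully_human = 10_190
light, moderate, heavy, fully_ai = 10, 1, 1, 0

# The categories should account for every review.
assert fully_human + light + moderate + heavy + fully_ai == total

light_fpr = light / total  # false-positive rate for "lightly AI-edited"
heavy_fpr = heavy / total  # false-positive rate for "heavily AI-edited"
print(f"light-edit FPR: 1 in {round(1 / light_fpr):,}")  # 1 in 1,020
print(f"heavy-edit FPR: 1 in {round(1 / heavy_fpr):,}")  # 1 in 10,202
```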

SoftTalker · yesterday at 4:57 PM

In particular, conference papers are already extremely formulaic: organized in a particular way and built from the same stock phrasings and terms of art. AI-written or not, they're hard to tell apart.

Jensson · yesterday at 5:46 PM

The conference papers were flagged at 1%, the peer reviews at 20%. Is there any explanation for that big difference other than more of the peer reviews being AI-generated than the papers themselves?

We can't use this to convict any single reviewer, but we can say with near certainty that many reviewers simply handed the review work to an AI.
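One way to make "almost surely" concrete: under the null hypothesis that every flagged review is a false positive at the FPR quoted upthread, a 20% flag rate is essentially impossible. A rough Chernoff-bound sketch (the thread doesn't give the number of peer reviews, so n = 1,000 below is a purely hypothetical sample size):

```python
import math

def chernoff_log10_upper_bound(k: int, n: int, p: float) -> float:
    """log10 upper bound on P(X >= k) for X ~ Binomial(n, p), valid when k/n > p.

    Uses the standard bound P(X >= k) <= exp(-n * KL(k/n || p)).
    """
    q = k / n
    kl = q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))
    return -n * kl / math.log(10)

n = 1_000      # hypothetical number of peer reviews (not from the thread)
k = 200        # 20% flagged, per the comment above
p = 1 / 1_000  # generous null: use the light-edit false positive rate

bound = chernoff_log10_upper_bound(k, n, p)
print(f"P(>= {k} false positives out of {n}) <= 10^{bound:.0f}")
# The bound comes out to hundreds of orders of magnitude below 1,
# so "all false positives" is not a plausible explanation.
```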