maxspero • yesterday at 5:52 PM • 8 replies

I am not sure if you are familiar with Pangram (co-founder here), but we are a group of research scientists who have made significant progress in this problem space. If your mental model of AI detectors is still GPTZero or the ones that say the Declaration of Independence is AI, then you probably haven't seen how much better they've gotten.

This paper by University of Chicago economists found zero false positives among 1,992 human-written documents and over 99% recall in detecting AI documents. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5407424

Replies

nialse • yesterday at 6:10 PM

Nothing points out that a benchmark is invalid like a zero false positive rate. Seemingly it is pre-2020 text vs. a few models' reworks of texts. I can see this model falling apart in many real-world scenarios. Yes, LLMs use strange language if left to their own devices, and this can surely be detected. A 0% false positive rate under all circumstances? Implausible.
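To put a rough number on what a clean result on 1,992 documents can and cannot show, here is a minimal sketch of the exact one-sided upper bound on an error rate when zero errors are observed. The function name `upper_bound_zero_errors` and the 95% confidence level are assumptions chosen for illustration, not anything taken from the paper, and the bound only speaks to documents drawn from the same distribution as the benchmark.

```python
def upper_bound_zero_errors(n, confidence=0.95):
    """Exact one-sided upper bound on an error rate after 0 errors in n trials.

    With true rate p, the chance of seeing 0 errors in n independent trials
    is (1 - p)**n. Setting that equal to 1 - confidence and solving for p
    gives the largest rate still consistent with the clean result.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


# 1,992 is the human-written document count quoted in the parent comment.
n_human_docs = 1992
bound = upper_bound_zero_errors(n_human_docs)

print(f"0/{n_human_docs} false positives, 95% upper bound: {bound:.3%}")
# The familiar "rule of three" shortcut approximates the same bound as 3/n.
print(f"rule-of-three approximation: {3 / n_human_docs:.3%}")
```

Under these assumptions the bound comes out at about 0.15%, so the observed zero is compatible with a small nonzero rate even on the benchmark's own kind of text, and it says nothing about other kinds.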

bonsai_spool • yesterday at 7:06 PM

                          EditLens (Ours)
                   Predicted Label
                Human     Mix       AI
             ┌─────────┬─────────┬─────────┐
       Human │  1770   │   111   │    0    │
             ├─────────┼─────────┼─────────┤
 True  Mix   │   265   │  1945   │   28    │
 Label       ├─────────┼─────────┼─────────┤
         AI  │    0    │   186   │  1695   │
             └─────────┴─────────┴─────────┘

It looks like about 6% of human texts from your paper are marked as mixed (111 of 1,881), and about 10% of AI texts are marked as mixed (186 of 1,881).
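For anyone who wants to check the percentages, here is a minimal sketch that row-normalizes the confusion matrix above. The counts are copied from the table; the variable and function names are made up for illustration.

```python
# Confusion matrix as posted above (rows: true label, columns: predicted label).
labels = ["Human", "Mix", "AI"]
confusion = {
    "Human": [1770, 111, 0],
    "Mix": [265, 1945, 28],
    "AI": [0, 186, 1695],
}


def row_rates(matrix):
    """For each true label, return the share of its documents per predicted label."""
    rates = {}
    for true_label, counts in matrix.items():
        total = sum(counts)
        rates[true_label] = [count / total for count in counts]
    return rates


rates = row_rates(confusion)
for true_label, shares in rates.items():
    cells = ", ".join(f"{pred}: {share:.1%}" for pred, share in zip(labels, shares))
    print(f"true {true_label:<5} -> {cells}")
```

Each row sums to 100%, so the off-diagonal cells read directly as the misclassification rate for that true label.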

I guess I don’t see that this is much better than what’s come before, using your own paper.

Edit: this is an irresponsible Nature news article, too - we should see a graph of this detector over the past ten years to see how much of this ‘deluge’ is algorithmic error

lifthrasiir • yesterday at 6:57 PM

It is not wise to brag about your product when the GP is pointing out that the article "reads like PR for Pangram", no matter whether AI detectors are reliable or not.

ugh123 • yesterday at 10:38 PM

How do you discern between papers "completely fabricated" by AI vs. edited by AI for grammar?

rs186 • yesterday at 6:24 PM

The response would be more helpful if it directly addressed the arguments in the posts from that search result.

ThrowawayTestr • yesterday at 8:46 PM

Are you concerned about your product being used to make AI less detectable?

jay_kyburz • yesterday at 6:37 PM

I thought the author was attempting to highlight the hypocrisy of using an AI to detect other uses of AI, as if one was a good use, and the other bad.

moffkalast • yesterday at 6:24 PM

I see the bullshit part continues on the PR side as well, not just in the product.
