Hacker News

maxspero yesterday at 5:52 PM (8 replies)

I am not sure if you are familiar with Pangram (co-founder here), but we are a group of research scientists who have made significant progress in this problem space. If your mental model of AI detectors is still GPTZero or the ones that say the Declaration of Independence is AI, then you probably haven't seen how much better they've gotten.

This paper by economists from the University of Chicago found zero false positives among 1,992 human-written documents and over 99% recall in detecting AI documents. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5407424


Replies

nialse yesterday at 6:10 PM

Nothing signals an invalid benchmark like a zero false positive rate. Seemingly it is pre-2020 text vs. a few models' reworkings of texts. I can see this model falling apart in many real-world scenarios. Yes, LLMs use strange language if left to their own devices, and this can surely be detected. A 0% false positive rate under all circumstances? Implausible.
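
For what it's worth, zero observed false positives in a finite sample does not mean a true rate of zero. A minimal sketch in Python of the one-sided exact (Clopper-Pearson) upper bound for the zero-event case, which comes out to roughly 0.15% for 1,992 documents at 95% confidence. The function and variable names are my own, and the bound assumes the 1,992 documents are independent draws from the same population the detector would see in practice, which is precisely the assumption in question here.

```python
import math


def zero_event_upper_bound(n: int, confidence: float = 0.95) -> float:
    """One-sided exact upper bound on a rate when 0 events occur in n trials.

    Solves (1 - p)^n = alpha for p, where alpha = 1 - confidence.
    Assumes independent, identically distributed trials.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    alpha = 1.0 - confidence
    return 1.0 - math.pow(alpha, 1.0 / n)


# 1,992 is the number of human-written documents quoted in the thread.
n_human_docs = 1992

upper = zero_event_upper_bound(n_human_docs)

# The "rule of three" (3 / n) is the usual quick approximation of the same bound.
rule_of_three = 3 / n_human_docs

print(f"exact 95% upper bound on the false positive rate: {upper:.4%}")
print(f"rule-of-three approximation:                      {rule_of_three:.4%}")
```

So the sample is consistent with a false positive rate anywhere from zero up to about 1 in 660 on that kind of text, and it says nothing about text drawn from a different distribution.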

bonsai_spool yesterday at 7:06 PM

                          EditLens (Ours)
                   Predicted Label
                Human     Mix       AI
             ┌─────────┬─────────┬─────────┐
       Human │  1770   │   111   │    0    │
             ├─────────┼─────────┼─────────┤
 True  Mix   │   265   │  1945   │   28    │
 Label       ├─────────┼─────────┼─────────┤
         AI  │    0    │   186   │  1695   │
             └─────────┴─────────┴─────────┘

It looks like 5% of human texts in your paper are marked as mixed, and 5-10% of mixed texts are marked as AI.
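
The per-class rates can be recomputed from the matrix rather than eyeballed. A minimal sketch in Python, with the counts copied from the EditLens table above; the variable and function names are my own and not from the paper.

```python
# Rows are true labels, columns are predicted labels, in the order below.
labels = ["Human", "Mix", "AI"]

# Counts copied from the EditLens confusion matrix quoted above.
confusion = {
    "Human": [1770, 111, 0],
    "Mix": [265, 1945, 28],
    "AI": [0, 186, 1695],
}


def row_rates(counts):
    """Normalize one row of counts by its total (the number of true examples)."""
    total = sum(counts)
    return [count / total for count in counts]


# rates[true_label][predicted_label] = share of true_label texts given that prediction
rates = {
    true_label: dict(zip(labels, row_rates(counts)))
    for true_label, counts in confusion.items()
}

for true_label, by_prediction in rates.items():
    row_total = sum(confusion[true_label])
    cells = ", ".join(
        f"{predicted}: {share:.1%}" for predicted, share in by_prediction.items()
    )
    print(f"true {true_label:<5} (n={row_total}): {cells}")
```

Each row is divided by its own true-label total, so the off-diagonal cells are the misclassification rates for that class.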

I guess I don’t see that this is much better than what’s come before, using your own paper.

Edit: this is an irresponsible Nature news article, too - we should see a graph of this detector over the past ten years to see how much of this ‘deluge’ is algorithmic error

lifthrasiir yesterday at 6:57 PM

It is not wise to brag about your product when the GP is pointing out that the article "reads like PR for Pangram", no matter whether AI detectors are reliable or not.

ugh123 yesterday at 10:38 PM

How do you discern between papers "completely fabricated" by AI vs. edited by AI for grammar?

rs186 yesterday at 6:24 PM

The response would be more helpful if it directly addressed the arguments in posts from that search result.

ThrowawayTestr yesterday at 8:46 PM

Are you concerned about your product being used to make AI output less detectable?

jay_kyburz yesterday at 6:37 PM

I thought the author was attempting to highlight the hypocrisy of using an AI to detect other uses of AI, as if one were a good use and the other bad.

moffkalast yesterday at 6:24 PM

I see the bullshit part continues on the PR side as well, not just in the product.