The Science of Detecting LLM-Generated Text

26 points • by vinhnx • today at 2:09 AM • 9 comments

Comments

This is an article from 2024, when open-weights models like Llama were only beginning to emerge. With those you basically cannot do any reliable detection (as the authors admit by the end).

Which really boils down to the text having statistical properties very similar to human-written text. Introduce a more motivated attacker and the text would be indistinguishable from the real thing (with occasional typos, no use of "delve", no "it's not X, it's Y", no em dashes, and so on).

It really is a lost battle: you cannot embed extra information in the text that will survive even basic postprocessing (in contrast to, say, steganography).

Akranazon • today at 8:26 AM

Detecting LLM-generated text is basically solved by modern watermarking techniques (https://arxiv.org/abs/2306.09194). However, the main trouble with watermark-based approaches is that you have to get every LLM provider to adopt them. A student trying to cheat could always opt for some open-weight Chinese model if word spreads that the major providers are compromised.
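For anyone who hasn't seen how watermark detection works in practice, here is a minimal sketch of one well-known family, the keyed "green list" approach. It is not necessarily the scheme in the linked paper. The names (`is_green`, `watermark_z_score`, `generate_watermarked`), the key, and the 50% green fraction are all made up for illustration, and the "model" is just a uniform sampler over a toy vocabulary.

```python
import hashlib
import math
import random

GAMMA = 0.5  # assumed fraction of the vocabulary that is "green" at each step


def is_green(prev_token: str, token: str, key: str = "demo-key") -> bool:
    """Keyed pseudo-random split: is `token` green given the previous token?"""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    value = int.from_bytes(digest[:8], "big") / 2**64
    return value < GAMMA


def watermark_z_score(tokens, key: str = "demo-key") -> float:
    """z-score of the green-token count against the no-watermark expectation."""
    n = len(tokens) - 1  # number of (previous, current) pairs
    if n < 1:
        return 0.0
    green = sum(is_green(p, t, key) for p, t in zip(tokens, tokens[1:]))
    return (green - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))


def generate_watermarked(vocab, length, key: str = "demo-key", seed: int = 0):
    """Toy stand-in for an LLM: sample uniformly, but only from green tokens."""
    rng = random.Random(seed)
    tokens = [rng.choice(vocab)]
    while len(tokens) < length:
        candidates = [t for t in vocab if is_green(tokens[-1], t, key)]
        # Fall back to the full vocabulary if no token happens to be green.
        tokens.append(rng.choice(candidates or vocab))
    return tokens


if __name__ == "__main__":
    vocab = [f"w{i}" for i in range(200)]
    marked = generate_watermarked(vocab, 201)
    rng = random.Random(1)
    plain = [rng.choice(vocab) for _ in range(201)]
    print("watermarked z:", round(watermark_z_score(marked), 2))
    print("unmarked z:   ", round(watermark_z_score(plain), 2))
```

The detector needs only the key, not the model. That is also why adoption is the bottleneck: text from a model that never applied the bias scores like ordinary text, and heavy paraphrasing pulls a watermarked score back toward zero.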

nextzck • today at 8:14 AM

I built one and it’s open source: https://github.com/johnzfitch/specho-v2. Last I checked it was 141D or something, but I might have left some dimensions behind at the original specHO repo. I simplified the architecture in v2. That’s essentially how you scale it. Oddly enough, the easiest model to predict was humans. Almost like a baseline. See my archived v1 repo for more of the nuts and bolts.

giancarlostoro • today at 5:46 AM

I see a lot of people claiming just about everything is AI these days, including totally normal videos, photos and text. I'm not sure what the solution to this phenomenon will be, but we're in for a bit of trouble for a while.

wps • today at 8:02 AM

Detection methods only serve to stop the most blatant, low-effort kind of LLM responses. The more pressing issue is that people are reading LLM output and paraphrasing it for their assignments, reports, emails, etc. The obvious problem is that LLMs are often wrong, or miss nuance in ways a layperson won't notice. The secondary problem is the general outsourcing of thinking and effort, even for tasks that you ought to give your focus to. BTW: from my anecdata, most university students are absolutely violating academic integrity with these tools, and have completely lost the ability to engage without them.
