There are dozens of first-generation AI detectors and they all suck. I'm not going to defend them. Most of them use perplexity-based methods, which are a decent separator of AI and human text (80-90%) but have flaws that can't be overcome and high false positive rates on ESL text.
https://www.pangram.com/blog/why-perplexity-and-burstiness-f...
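For anyone who hasn't seen one, here is a rough sketch of what a perplexity-based detector boils down to. It assumes you already have per-token natural-log probabilities from some scoring language model (not shown). The function names and the threshold of 20.0 are made up for illustration and aren't taken from any real detector.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def looks_ai_generated(token_logprobs, threshold=20.0):
    """Flag text as AI-like when the scoring model finds it very predictable.

    The threshold is a hypothetical value; a real detector would have to
    tune it on labeled data.
    """
    return perplexity(token_logprobs) < threshold

# Text where every token had probability 0.5 under the scoring model
predictable = [math.log(0.5)] * 4
# Text where every token had probability 0.01 under the scoring model
surprising = [math.log(0.01)] * 4

print(looks_ai_generated(predictable))  # low perplexity, gets flagged
print(looks_ai_generated(surprising))   # high perplexity, passes as human
```

The whole decision rests on one number, so any human who writes in simple, predictable phrasing lands on the wrong side of the threshold. That is probably a big part of why these detectors misfire on ESL text.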
Pangram is a fundamentally different technology: it's a large deep-learning-based model trained on hundreds of millions of human and AI examples. Some people see a dozen failed attempts at a problem as proof that the problem is impossible, but I would like to remind you that basically every major and minor technology was preceded by failed attempts.
GAN... Just feed the output of your algorithms back into the LLM while it's learning. At the end of the day the problem is impossible, but we're not there yet.
Can your software detect which LLMs most likely generated a text?
Some people see a dozen extremely profitable, extremely destructive attempts at a problem as proof that the problem is not a place for charitable interpretation.