People confidently stating stuff like "current LLMs basically pass the Turing test" makes me feel like I've secretly been given much worse versions of all the LLMs in some kind of study. It's so divorced from my experience of these tools that I genuinely don't understand how my experience can be so far from yours, unless "basically" is doing a lot of heavy lifting here.
As far as I understand, Turing himself did not specify a duration, but here's an example paper that ran a randomized study on (the old) GPT 4 with a 5 minute duration, and the AI passed with flying colors - https://arxiv.org/abs/2405.08007
In my experience, AI has improved significantly since then, and I expect that o3 or Claude 4 Opus would pass a 30-minute test.
Per the Wikipedia article on the Turing test:
> In the test, a human evaluator judges a text transcript of a natural-language conversation between a human and a machine. The evaluator tries to identify the machine, and the machine passes if the evaluator cannot reliably tell them apart. The results would not depend on the machine's ability to answer questions correctly, only on how closely its answers resembled those of a human.
Based on this, I would agree with the OP in many contexts. So, yeah, 'basically' is a load-bearing word here, but it seems reasonably correct in the context of distinguishing human vs bot in any scalable and automated way.
Here are three comments: two were written by humans and one by a bot. Can you tell which is which?
> Didn’t realize plexiglass existed in the 1930s!

> I'm certainly not a monetization expert. But don't most consumers recoil in horror at subscriptions? At least enough to offset the idea they can be used for everything?

> Not sure why this isn’t getting more attention - super helpful and way better than I expected!
Well, LLMs do pass the Turing Test, sort of.
I have seen data from an AI call center showing that 70% of users never suspected they were speaking to an AI.
It can't mimic a human over the long term, but it can pass a short exchange, the way it can solve a short, easy-for-humans CAPTCHA.
> "current LLMs basically pass the Turing test" makes me feel like I've secretly been given much worse versions of all the LLMs in some kind of study.
I think you may believe that passing the Turing test is more difficult and meaningful than it actually is. Computers have been able to pass the Turing test for longer than genAI has been around. Even Turing didn't consider it a useful test in practice; he meant it as a thought experiment.