Is there actually a reliable way to differentiate human from bot?

ryan_n • today at 5:29 PM • 3 replies

Replies

As I understand it, as the models driving headless-browser agents get more and more sophisticated, it's getting harder to reliably tell them apart from humans.

The same way LLM output without watermarking cannot be reliably classified as "not-human", neural-network-driven scraping tools are getting harder to detect.

Cloudflare and DataDome position themselves as companies that can detect automated traffic using things like IP reputation, behavioral signals, timing... But these things can be faked. IP reputation can be sidestepped through proxy networks, and human behavior signals can be imitated with generative AI the same way text can be: web bots can use neural networks to generate trajectories and timings similar to those of humans.
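To make the timing point concrete, here is a minimal sketch in Python. It is my own illustration, not how Cloudflare or DataDome actually work: `looks_automated`, the `min_cv` threshold, and the log-normal delay parameters are all assumptions made up for the example. It flags sessions whose gaps between events are suspiciously regular, then shows a bot escaping the check just by drawing its delays from a skewed random distribution.

```python
import random
import statistics


def interval_cv(timestamps):
    """Coefficient of variation of the gaps between consecutive events."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean = statistics.mean(gaps)
    return statistics.pstdev(gaps) / mean if mean > 0 else 0.0


def looks_automated(timestamps, min_cv=0.2):
    """Hypothetical rule: flag sessions whose event timing is too regular.

    min_cv is an arbitrary illustrative threshold, not a tuned value.
    """
    if len(timestamps) < 3:
        return False  # not enough events to judge
    return interval_cv(timestamps) < min_cv


# A cheap bot that fires an event every 100 ms.
naive_bot = [i * 0.1 for i in range(20)]

# A slightly smarter bot that samples its delays from a log-normal
# distribution (assumed parameters), so the gaps are irregular.
rng = random.Random(0)
t = 0.0
jittered_bot = []
for _ in range(20):
    t += rng.lognormvariate(-1.5, 0.6)
    jittered_bot.append(t)

print("naive bot flagged:   ", looks_automated(naive_bot))
print("jittered bot flagged:", looks_automated(jittered_bot))
```

A single statistic like this only catches the laziest bots. Real detectors presumably combine many signals, but each one is open to the same kind of imitation.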

If you can have an AI use a browser the same way a human can, how can you distinguish the two?

mpeg • today at 6:04 PM

There are reliable ways of differentiating human from cheap, bulk scraping bots.

But if the bot is advanced / expensive enough, it gets a lot harder. Where this product's market sits is in giving a paid way to access content, compared to having to spin up bots that run JS, from real IP addresses, etc., all of which are more expensive.


ihsw • today at 5:36 PM

[dead]
