The most egregious thing Perplexity did was to straight up ignore robots.txt. Cloudflare promise not to do that, so if we take their word for it, it's quite a different setup.
That said, I'm not a fan of letting users forge whatever user agent they please. Instead, AIUI, to opt out of being crawled I have to check for the presence of certain request headers[1].
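For what it's worth, the header check itself is simple. Here is a rough sketch, not anything from Cloudflare's docs: the function name and the header name in MARKER_HEADERS are made-up placeholders, and the real header names would have to come from the page linked at [1].

```python
from typing import Iterable, Mapping

# Hypothetical placeholder; substitute the header names documented at [1].
MARKER_HEADERS = ("x-example-crawler-id",)


def looks_like_marked_crawler(
    headers: Mapping[str, str],
    markers: Iterable[str] = MARKER_HEADERS,
) -> bool:
    """Return True if any marker header is present in the request.

    HTTP header names are case-insensitive, so both sides are lowercased
    before comparing. Only presence is checked, not the header's value.
    """
    present = {name.lower() for name in headers}
    return any(marker.lower() in present for marker in markers)
```

A server could call this on each incoming request and answer with a 403 when it returns True. Note that it only catches clients that actually send the marker headers, so it still relies on the crawler operator keeping their word.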
[1]: https://developers.cloudflare.com/browser-rendering/referenc...