Hacker News

michaelmior, yesterday at 11:50 PM

> I'm surprised that Cloudflare hasn't started hosting a pre-scraped version of websites that use Cloudflare's proxy

It's entirely possible that they're doing this under the hood for cases where they can clearly identify the content they have cached is public.


Replies

janalsncm, today at 12:27 AM

How would they know the content hasn’t changed without hitting the website?

binarymax, today at 12:11 AM

Based on the post, it seems likely that they'd just delay per the robots.txt policy no matter what, and do a full browser render of the cached page to get the content. Probably overkill for lots and lots of sites. An HTML fetch + readability is really cheap.
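
For anyone wondering what the cheap path looks like, here is a rough sketch, not anything from the post: it reads a robots.txt policy with Python's standard `urllib.robotparser`, waits out the crawl delay, and does a plain HTML fetch. The `TextExtractor` class is only a crude stand-in for a real readability library, and `ROBOTS_TXT`, `polite_delay`, `polite_fetch`, and the `example-bot` user agent are made-up names for illustration.

```python
import time
import urllib.request
from html.parser import HTMLParser
from urllib import robotparser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
Disallow: /private/
"""


class TextExtractor(HTMLParser):
    """Collects visible text and skips <script>/<style>.

    A crude stand-in for a readability library, not a replacement for one.
    """

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def load_policy(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network call."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp


def polite_delay(rp, user_agent, default=0.0):
    """Seconds to wait between requests, per the Crawl-delay directive if present."""
    delay = rp.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


def polite_fetch(rp, url, user_agent="example-bot"):
    """Fetch a page's text only if robots.txt allows it, after the crawl delay."""
    if not rp.can_fetch(user_agent, url):
        return None
    time.sleep(polite_delay(rp, user_agent))
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return extract_text(resp.read().decode(charset, errors="replace"))
```

No headless browser involved, which is why this is so much cheaper than a full render; the trade-off is that anything rendered client-side by JavaScript is missed.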