jfindper • last Monday at 4:12 PM • 3 replies

It is an _incredible_ stretch to frame certificate transparency logs as "content" in the creative sense.

The whole purpose of this data is to be consumed by 3rd-parties.

Replies

I don't see an issue with OAI scraping public logs.

But what GP probably meant is that OAI definitely uses this log to get a list of new websites in order to scrape them later. This is a pretty standard way to use CT logs: you get a list of domains to scrape instead of relying solely on hyperlinks.
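
For anyone who hasn't seen this done, here is a rough sketch of the idea, not anything OAI is known to run. It assumes the CT entries have already been fetched and decoded into dicts; the `dns_names` field and the function name are made up for illustration and are not any real log's schema.

```python
def domains_from_ct_entries(entries, seen=None):
    """Return hostnames from CT-style records that were not seen before,
    in first-seen order."""
    seen = set() if seen is None else seen
    fresh = []
    for entry in entries:
        # Hypothetical record shape: {"dns_names": ["example.com", "*.example.com"]}
        for name in entry.get("dns_names", []):
            host = name.strip().lower().rstrip(".")
            if host.startswith("*."):
                host = host[2:]  # a wildcard only reveals the parent domain
            if host and host not in seen:
                seen.add(host)
                fresh.append(host)
    return fresh


entries = [
    {"dns_names": ["Example.com", "*.example.com", "www.example.com"]},
    {"dns_names": ["new-site.example.org."]},
    {},
]
crawl_queue = domains_from_ct_entries(entries)
```

The result is just a deduplicated list of hostnames to hand to a crawler, so a site can be found from its certificate alone, before anything links to it.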

ang_cire • last Monday at 7:14 PM

I think their point is that the people registering certs may not intend their sites to be immediately scraped, but now OpenAI is bypassing e.g. Google indexing or web spidering, and using your cert provider's CT entries to find you immediately for scraping.

advisedwang • last Monday at 5:58 PM

matt3210 clearly means that the content of the website (revealed by the CT log) is what is being stolen, not the data in the CT log.
