goobatrooba • today at 7:16 AM • 2 replies

The most interesting thing about this post is how easy it seems for OpenAI to run analysis on basically all chats ever made. They don't say exactly what data they analysed, but they seem confident in statements like "0.12% of all queries contained this word". So everything is saved. Long-term. Fully accessible.

As this all seems so straightforward, I would be surprised if anything is anonymised or otherwise sanitised to preserve privacy or users' secrets.

Replies

lionkor • today at 7:38 AM

Yes, of course. Every single bit of data you send to OpenAI is stored, catalogued, indexed, analyzed, and trained on. It'll simply be an "oops, we miscatalogued and accidentally trained GPT 6 on all data, not just data we got consent for".

If you think "wait, that's illegal"--so is the initial training on stolen data lol

upbeat_general • today at 7:23 AM

Sampling exists.
