logoalt Hacker News

hephaes7usyesterday at 11:04 PM1 replyview on HN

Why do you even necessarily think that wouldn't happen?

As I understand it, we'd essentially be relying on something like an mp3 compression algorithm to fail to capture a particular, subtle transient -- the lossy nature itself is the only real protection.

I agree that it's vanishingly unlikely if one person includes a sensitive document in their context, but what if a company has a project context which includes the same document in 10,000 chats? Maybe then it's more much likely that whatever private memo could be captured in training...


Replies

simonwyesterday at 11:07 PM

I did get an answer from a senior executive at one AI lab who called this the "regurgitation problem" and said that they pay very close attention to it, to the point that they won't ship model improvements if they are demonstrated to cause this.

show 1 reply