I would think the most obvious explanation is that they are used as part of a watermark to help OpenAI identify text - i.e. the model isn't doing it at all, but a final-pass process adds statistical patterns on top of what the model actually generates (along with words like 'delve' and other famous GPT signatures).
I don't have evidence that that's true, but it's what I assume and I'm surprised it's not even mentioned as a possibility.
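If something like that does exist, I'd guess it looks roughly like the 'green list' scheme from Kirchenbauer et al. (2023): each step pseudo-randomly splits the vocabulary and the sampler is nudged toward the 'green' half, which a detector can test for statistically. Here's a toy detector sketch - the hash, split ratio, and everything else are placeholders of mine, and none of this is confirmed to be what OpenAI actually does:

    # Toy 'green list' watermark detector in the style of Kirchenbauer et al. (2023).
    # Purely illustrative: the hash seeding and 50/50 split are arbitrary choices here.
    import hashlib
    import math

    GREEN_FRACTION = 0.5  # assumed share of the vocabulary marked "green" at each step

    def is_green(prev_token: str, token: str) -> bool:
        """Pseudo-randomly assign `token` to the green list, seeded by the previous token."""
        digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
        return digest[0] / 255.0 < GREEN_FRACTION

    def watermark_z_score(tokens: list[str]) -> float:
        """z-score of how many tokens land on their step's green list.
        Unwatermarked text should hover near 0; watermarked text drifts high."""
        n = len(tokens) - 1
        hits = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
        expected = GREEN_FRACTION * n
        std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
        return (hits - expected) / std

    if __name__ == "__main__":
        sample = "the model quietly prefers certain tokens over others".split()
        print(f"z = {watermark_z_score(sample):.2f}")

The generation side would bias sampling toward each step's green tokens, so watermarked output accumulates a high z-score while ordinary human text doesn't.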
When I studied author profiling, I built models that could identify specific authors, given enough text, just by how often they used very boring words like 'of' and 'and'. So I'm assuming OpenAI plays around with variables like that, which would be much harder for humans to spot, but probably uses several layers of watermarking to make it harder to strip - which results in some 'obvious' ones too.
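To make that concrete, here's a toy version of the kind of function-word profiling I mean - fingerprint each author by the relative frequency of boring words, then attribute new text to the closest fingerprint. The word list and cosine-similarity matching are just illustrative choices, not my actual setup back then:

    # Toy function-word author profiling: word list and metric are illustrative only.
    import math
    from collections import Counter

    FUNCTION_WORDS = ["of", "and", "the", "to", "in", "that", "is", "it", "but", "with"]

    def fingerprint(text: str) -> list[float]:
        """Relative frequency of each function word in the text."""
        counts = Counter(text.lower().split())
        total = sum(counts.values()) or 1
        return [counts[w] / total for w in FUNCTION_WORDS]

    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
        return dot / norm if norm else 0.0

    def attribute(unknown: str, corpus: dict[str, str]) -> str:
        """Return the author whose function-word profile best matches `unknown`."""
        probe = fingerprint(unknown)
        return max(corpus, key=lambda author: cosine(probe, fingerprint(corpus[author])))

    if __name__ == "__main__":
        corpus = {
            "author_a": "the cat sat on the mat and the dog slept in the sun and the day went by",
            "author_b": "of all the things to consider, the question of timing is of particular note",
        }
        print(attribute("it is a question of balance and of patience", corpus))

With enough text per author, even a crude profile like this separates writers surprisingly well, which is why frequency-level watermarks would be hard for a human reader to notice.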
Honestly the most obvious explanation is that the training set has a lot of them, not some sort of watermarking conspiracy. Occam's razor at its best.
Obvious watermarking that consistently gets a lot of hate from vocal minorities (devs, journalists, etc.) would probably just be removed, for the benefit of the other layers you mention.
But watermarking layers are a fascinating idea (and extremely likely to exist), thanks!