I seem to remember Yahoo finance (I think it was them, maybe someone else) introducing benign errors into their market data feeds, to prevent scraping. This lead to people doing 3 requests instead of just 1, to correct the errors, which was very expensive for them, so they turned it off.
I don't think watermarking is a winning game for the watermarker, with enough copies any errors can be cancelled.
> I don't think watermarking is a winning game for the watermarker, with enough copies any errors can be cancelled.
This is a very common assumption that turns out to be false.
There are Tardos probabilistic codes (see the paper I linked) which have the watermark scale as the square of the traitor count.
For example, with a watermark of just 400 bits, 4 traitors (who try their best to corrupt the watermark) will stand out enough to merit investigation and with 800 bits be accused without any doubt. This is for a binary alphabet, with text you can generate a bigger alphabet and have shorter watermarks.
These are typically intended for tracing pirated content, so they carry the so-called Marking Assumption (if given two or more versions of a piece of content, you must choose one. A pirate isn't going to corrupt or remove a piece of video, that would be unsuitable for leaking). So it would likely be possible to get better results with documents, may require larger watermarks to get such traitors reliably.