Hacker News

alphazard, today at 12:29 AM

I seem to remember Yahoo Finance (I think it was them, maybe someone else) introducing benign errors into their market data feeds to prevent scraping. This led people to make 3 requests instead of just 1 so they could correct the errors, which was very expensive for the provider, so they turned it off.

I don't think watermarking is a winning game for the watermarker: with enough copies, any errors can be cancelled out.
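
To make the "3 requests instead of 1" idea concrete, here is a minimal sketch of per-position majority voting across fetched copies. The function name `majority_merge` and the price values are made up for illustration, and it assumes the injected errors land in different places on each request, which is the assumption the whole argument rests on.

```python
from collections import Counter

def majority_merge(copies):
    """Per-position majority vote across several fetched copies of the same series."""
    if not copies or len({len(c) for c in copies}) != 1:
        raise ValueError("need at least one copy, all of equal length")
    # For each position, keep the value that most copies agree on.
    return [Counter(values).most_common(1)[0][0] for values in zip(*copies)]

# Hypothetical price series; each response carries one injected error.
truth = [101.2, 101.5, 100.9, 102.0]
copies = [
    [101.2, 101.6, 100.9, 102.0],  # error at position 1
    [101.2, 101.5, 100.8, 102.0],  # error at position 2
    [101.3, 101.5, 100.9, 102.0],  # error at position 0
]
merged = majority_merge(copies)
print(merged == truth)
```

This only works if no two copies share the same error at the same position. It removes the errors but says nothing about whether the merged result can still be traced.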


Replies

coppsilgold, today at 12:35 AM

> I don't think watermarking is a winning game for the watermarker: with enough copies, any errors can be cancelled out.

This is a very common assumption that turns out to be false.

There are Tardos probabilistic codes (see the paper I linked) whose watermark length scales as the square of the traitor count.

For example, with a watermark of just 400 bits, 4 traitors (who try their best to corrupt the watermark) will stand out enough to merit investigation, and with 800 bits they can be accused without any doubt. This is for a binary alphabet; with text you can use a bigger alphabet and get shorter watermarks.

These codes are typically intended for tracing pirated content, so they rely on the so-called Marking Assumption: given two or more versions of a piece of content, the pirate must choose one of them. A pirate isn't going to corrupt or remove a piece of video, since that would make it unsuitable for leaking. So with documents it would likely be possible for traitors to get better results, and larger watermarks may be required to catch them reliably.
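
A rough simulation of the idea, as a sketch rather than the construction from the paper: it draws per-position biases from the arcsine distribution with Tardos's cutoff, lets colluders merge their copies by majority vote while respecting the Marking Assumption, and scores every user with the symmetric variant of the accusation score (my choice here, not something stated above). All function names and the 20-user setup are made up for illustration.

```python
import math
import random

def draw_biases(m, c, rng):
    # Bias p_i = sin^2(r), with r uniform on [t', pi/2 - t'] and sin^2(t') = 1/(300c).
    lo = math.asin(math.sqrt(1.0 / (300.0 * c)))
    return [math.sin(rng.uniform(lo, math.pi / 2 - lo)) ** 2 for _ in range(m)]

def make_codewords(n_users, biases, rng):
    # Each user gets an independent bit per position, equal to 1 with probability p_i.
    return [[1 if rng.random() < p else 0 for p in biases] for _ in range(n_users)]

def collude_majority(copies, rng):
    # Marking Assumption: where all colluders agree, the output is that bit.
    # Elsewhere they are free to choose; here they take the majority, ties at random.
    out = []
    for bits in zip(*copies):
        ones = sum(bits)
        if 2 * ones > len(bits):
            out.append(1)
        elif 2 * ones < len(bits):
            out.append(0)
        else:
            out.append(rng.choice((0, 1)))
    return out

def accusation_scores(codewords, pirate, biases):
    # Symmetric score: reward agreement with the pirate copy, penalize disagreement,
    # weighted so that an innocent user's score has mean 0 and variance 1 per position.
    scores = []
    for word in codewords:
        s = 0.0
        for x, y, p in zip(word, pirate, biases):
            weight = math.sqrt((1 - p) / p) if x == 1 else math.sqrt(p / (1 - p))
            s += weight if x == y else -weight
        scores.append(s)
    return scores

def simulate(m=800, n_users=20, n_colluders=4, seed=0):
    rng = random.Random(seed)
    biases = draw_biases(m, n_colluders, rng)
    codewords = make_codewords(n_users, biases, rng)
    colluders = set(rng.sample(range(n_users), n_colluders))
    pirate = collude_majority([codewords[j] for j in colluders], rng)
    return accusation_scores(codewords, pirate, biases), colluders

if __name__ == "__main__":
    scores, colluders = simulate()
    for j, s in sorted(enumerate(scores), key=lambda t: -t[1]):
        print(j, round(s, 1), "colluder" if j in colluders else "")
```

Innocent scores have mean zero and variance m by construction, so the thing to look at is how many standard deviations (sqrt(m)) the colluders sit above zero as you vary `m` and `n_colluders`.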
