Hacker News

uqers · yesterday at 5:25 AM

> Unfortunately, the price LLM companies would have to pay to scrape every single Anubis deployment out there is approximately $0.00.

The math on the site linked here as a source for this claim is incorrect. The author of that site assumes that scrapers will keep track of the access tokens for a week, but most internet-wide scrapers don't do so. The whole purpose of Anubis is to be expensive for bots that repeatedly request the same site multiple times a second.


Replies

drum55 · yesterday at 5:43 AM

The "cost" of executing the JavaScript proof of work is fairly irrelevant; the whole concept just doesn't hold up under a pessimistic inspection. Anubis requires users to compute a trivial number of SHA-256 hashes in slow JavaScript, while a scraper can do the same work much faster in native code; simply game over. It's the same reason we don't use hashcash for email: the amount of proof of work a legitimate user will tolerate is much lower than the amount a professional can apply. If this tool provides any benefit, it's because it is obscure and non-standard.

When reviewing it I noticed that the author carries the common misunderstanding that "difficulty" in proof of work is simply the number of leading zero bytes in a hash, which quantizes the difficulty into powers of 256 rather than the powers of two that counting leading zero bits would allow. I realize that some of this is the cost of working in JavaScript, but the hottest code path seems to be written extremely inefficiently.

    let nonce = 0;
    for (;;) {
        const hashBuffer = await calculateSHA256(data + nonce);
        const hashArray = new Uint8Array(hashBuffer);

        let isValid = true;
        for (let i = 0; i < requiredZeroBytes; i++) {
          if (hashArray[i] !== 0) {
            isValid = false;
            break;
          }
        }
        if (isValid) break; // found a qualifying nonce
        nonce++;
    }
It wouldn't be an exaggeration to say that a native implementation of this with even a hair of optimization could make the "proof of work" less time-consuming than the TLS handshake.
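For comparison, a bit-granular check (the convention hashcash and most PoW schemes use) is barely more code. This is an illustrative sketch, not Anubis's actual code; the function name and signature are hypothetical:

```javascript
// Hypothetical bit-granular difficulty check: counts leading zero
// *bits* instead of whole zero bytes, so difficulty can be tuned in
// powers of two rather than jumping by factors of 256.
function hasLeadingZeroBits(hashArray, difficultyBits) {
  let remaining = difficultyBits;
  for (let i = 0; i < hashArray.length && remaining > 0; i++) {
    if (remaining >= 8) {
      // This whole byte must be zero.
      if (hashArray[i] !== 0) return false;
      remaining -= 8;
    } else {
      // Only the top `remaining` bits of this byte must be zero.
      return (hashArray[i] >> (8 - remaining)) === 0;
    }
  }
  return remaining <= 0;
}
```

With this, difficulty 12 means "first byte zero, plus the high nibble of the second byte zero", a step that byte-level checking simply cannot express.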
tptacek · yesterday at 5:31 AM

Right, but that's the point. It's not that the idea is bad. It's that PoW is the wrong fit for it. Internet-wide scrapers don't keep state? Ok, then force clients to do something that requires keeping state. You don't need to grind SHA2 puzzles to do that; you don't need to grind anything at all.

valicord · yesterday at 5:32 AM

The point is that the scrapers can easily bypass this if they cared to do so.
