It’s not a loophole; it just happens that encoding information as optical tokens is much more efficient than encoding it as text.
That's not what is happening. Claude isn't charging for the tokens it generates from the OCR on its side, but it's still processing the same number of tokens as if you had sent the text, just with the extra step of OCR on Claude's side. This is 100% a loophole that's burning extra resources.
Truly a picture is worth a thousand words.
> encoding information as optical tokens
Educate me: what is an "optical token" when dealing with LLMs?
Of course it isn't.
A text encoding uses 8 bits per character on average, and tokenization compresses that further.
A bitmap font would need 25 bits per character at 5x5 pixels (1 bit per pixel), and most fonts are 12 pixels high.
Of course it isn't more efficient. This is a pricing inefficiency and a hack to exploit it (even the author describes it as an exploit).
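To put rough numbers on that comparison, here is a minimal sketch of the raw-bit arithmetic. It only counts uncompressed bits per character and says nothing about how any model actually tokenizes images. The `glyph_bits` helper, the 6-pixel width for the 12-pixel-high font, and the 24-bit RGB case are assumptions added for illustration.

```python
def glyph_bits(width_px: int, height_px: int, bits_per_pixel: int = 1) -> int:
    """Raw bits needed to store one uncompressed glyph bitmap."""
    return width_px * height_px * bits_per_pixel


# One byte per character, as assumed in the comment above.
TEXT_BITS_PER_CHAR = 8

# Glyph sizes to compare; only 5x5 and 12-pixel height come from the comment.
cases = {
    "5x5 monochrome": glyph_bits(5, 5),
    "8x8 monochrome": glyph_bits(8, 8),
    "6x12 monochrome (width is an assumption)": glyph_bits(6, 12),
    "8x8 24-bit RGB (assumption)": glyph_bits(8, 8, 24),
}

# How many times larger each glyph is than one 8-bit text character.
ratios = {name: bits / TEXT_BITS_PER_CHAR for name, bits in cases.items()}

for name, bits in cases.items():
    print(f"{name}: {bits} bits per character, "
          f"{ratios[name]:.1f}x the 8-bit text encoding")
```

In raw bits every rasterized case is larger than the text it encodes, which is the point being made: any saving would have to come from how image input is priced or tokenized, not from the pixels being a denser encoding.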
Anyone else laugh out loud when they read this? Like, okay so NO, that's entirely impossible. What's really going on?
Step back and think about it another way - ask which scenario is more likely:
Some random person discovered a 60% across-the-board gain in all LLMs, using an extremely simple trick that none of the labs noticed in all these years. That trick being to rasterize 8-bit characters into 8x8 pixels in a big image. 60% in a market worth trillions of dollars.
or
Anthropic's marketing team arbitrarily prices tokens to drive growth, according to vibes and feelings, and didn't think they needed to price images on par with text in their rush to burn cash & drive growth. Some folks take advantage of the trick during the first few days of the model's availability before Anthropic corrects their pricing to align more proportionally with actual compute costs.