>This is really fascinating to me. I was reading this article and originally agreed with you, "I mean, under the covers it's got to be converting to text tokens at some point, so there is no way it's actually cheaper for Claude itself to execute."
It'd be weird if they were doing this, since it would mean the advertised context window size was a lie, and the API would presumably reject requests whose expanded form went over the 1M-token limit. Someone using pxpipe, with an effective context compression of 90% in some instances, would hit the limit at barely 100k tokens.
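To spell out the arithmetic, here's a rough sketch. It assumes "90% compression" means the compressed input is 10% of its expanded token count, and that the limit would be enforced on the expanded form; both function names are made up for illustration and are not part of pxpipe or any API.

```python
# Back-of-the-envelope check, under the assumptions stated above.
CONTEXT_LIMIT = 1_000_000  # tokens, the 1M limit discussed in this thread
COMPRESSION_PCT = 90       # percent of tokens saved by compression


def max_compressed_tokens(limit: int, compression_pct: int) -> int:
    """Largest compressed input whose expanded form still fits in `limit`."""
    return limit * (100 - compression_pct) // 100


def expanded_tokens(compressed: int, compression_pct: int) -> int:
    """Token count after a compressed input is expanded back out."""
    return compressed * 100 // (100 - compression_pct)


print(max_compressed_tokens(CONTEXT_LIMIT, COMPRESSION_PCT))  # 100000
print(expanded_tokens(100_000, COMPRESSION_PCT))              # 1000000
```

Integer percentages are used instead of floats so the result comes out as exactly 100,000 rather than something like 99,999.99.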