I've yet to see Anthropic promote any sort of token optimization strategy to its users - they always assume we all have infinite inference.
"No bread? Let them eat cake!"
Nah, they do. They push Sonnet pretty hard over Opus for most tasks.
Also: https://platform.claude.com/docs/en/agents-and-tools/tool-us...
I've noticed that's changed over the past month or so. Claude Code used to happily pipe build commands straight into context, but recently it's been running them as background tasks that pipe to a file, and it'll search and do partial reads on the output instead.
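Roughly the pattern I'm seeing, as a sketch (the build command, paths, and grep pattern here are just illustrative, not what Claude actually emits):

    # run the build in the background, capturing all output to a file
    make > /tmp/build.log 2>&1 &

    # then slice the log instead of dumping the whole thing into context
    grep -nEi 'error|warn' /tmp/build.log | head -20    # just the failures
    tail -n 40 /tmp/build.log                           # just the end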
It also gives tips on reducing context size when you run /context.
Presumably they're actually starting to feel the pinch on inference costs themselves, given what still feels like a fairly generous Max plan.
Not sure how you use CC, but the last 6 months have felt like a significant optimization effort to me. Last year Claude would just read and edit files; now it's all kinds of basic tool gymnastics with grep/awk/sed/etc. to narrowly slice files and avoid token-heavy reads. Resuming a session that isn't even that large gets a scary prompt about using a significant portion of your token budget if you continue without compacting.
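The kind of thing I mean, roughly (the file name and search pattern are made up):

    # locate the relevant line numbers first...
    grep -n 'def handle_request' src/server.py

    # ...then read only a narrow window around the hit, not the whole file
    sed -n '200,240p' src/server.py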
To me it feels like a worse experience, and they probably feel it too, but it makes sense from an optimization perspective. I've probably learned some shell tricks along the way, but I'm also going blind from watching Claude try dozens of variations of some multi-line nightmare of chained and piped bash instead of just reading a few files.