Hacker News

Prompt-caching – auto-injects Anthropic cache breakpoints (90% token savings)

66 points | by ermis | today at 11:38 AM | 26 comments

Comments

numlocked today at 12:57 PM

Per its own FAQ, this plugin is out of date and doesn't actually do anything incremental re: caching:

> "Hasn't Anthropic's new auto-caching feature solved this?"

> Largely, yes — Anthropic's automatic caching (passing "cache_control": {"type": "ephemeral"} at the top level) handles breakpoint placement automatically now. This plugin predates that feature and originally filled that gap.

somesnm today at 12:15 PM

Hasn't this been largely solved by the auto-caching Anthropic introduced recently, where you pass "cache_control": {"type": "ephemeral"} in your request and it places breakpoints automatically? https://platform.claude.com/docs/en/build-with-claude/prompt...
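For reference, the longer-standing form places the breakpoint on a specific content block rather than at the top level. A minimal sketch of that request shape, assuming the Python anthropic SDK conventions; the model name and prompt text here are placeholders, not values from the thread:

```python
# Sketch of a Messages API payload that marks a large, stable system
# prompt as cacheable by attaching cache_control to that block.
def build_cached_request(system_prompt: str, user_message: str) -> dict:
    """Build a request body whose system prompt is a cache breakpoint."""
    return {
        "model": "claude-sonnet-4-5",  # placeholder model name
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                # Everything up to and including this block is cached;
                # repeated calls with the same prefix read from cache.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_message}],
    }

payload = build_cached_request("You are a helpful assistant. " * 100, "Hello")
```

The dict can be splatted into `client.messages.create(**payload)`; the per-block form is what the plugin automated before breakpoint placement became automatic.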

joemazerino today at 3:39 PM

Firing off a cache write costs 1.2x tokens IIRC, meaning non-repeatable tasks will cost more in the long run.
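The break-even point is easy to work out. A small sketch using commonly cited multipliers (cache write at roughly 1.25x the base input price, cache read at roughly 0.1x; treat both as assumptions and check current Anthropic pricing):

```python
def caching_cost(prefix_tokens: float, calls: int,
                 write_mult: float = 1.25, read_mult: float = 0.1) -> float:
    """Input-token cost (in base-token units) for a cached prefix:
    one full-price write, then discounted reads on later calls."""
    if calls <= 0:
        return 0.0
    return prefix_tokens * (write_mult + (calls - 1) * read_mult)

def uncached_cost(prefix_tokens: float, calls: int) -> float:
    """Same prefix sent at full price on every call."""
    return prefix_tokens * calls

# A one-off task pays the write premium with nothing to amortize it:
# 1250 cached vs 1000 uncached for a 1000-token prefix and one call.
# By the second call within the TTL, caching already wins: 1350 vs 2000.
```

So the commenter's point holds for single-shot tasks, while anything that reuses the prefix even once comes out ahead.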

katspaugh today at 12:35 PM

> This plugin is built for developers building their own applications with the Anthropic API.

> Important note for Claude Code users: Claude Code already handles prompt caching automatically for its own API calls — system prompts, tool definitions, and conversation history are cached out of the box.

Source: their GitHub

adi_pradhan today at 12:56 PM

This applies only to the raw API, from what I understand, since Claude Code already caches quite aggressively (try npx ccusage).

Also, the Anthropic API already introduced prompt caching: https://platform.claude.com/docs/en/build-with-claude/prompt...

What is new here?

mijoharas today at 12:26 PM

I don't understand; Claude Code already has automatic prompt caching built in.[0] How does this change things?

[0] https://code.claude.com/docs/en/costs

fschuett today at 12:51 PM

Slightly off-topic, but I recently tested some tool and it turns out Opus is far cheaper than Sonnet, because it produces far fewer output tokens, and those are what's expensive. Sonnet is also much slower than Opus (I did 9 runs comparing Haiku, Sonnet and Opus on the same problem). I had thought "oh, Sonnet is more lightweight and cheaper than Opus"; no, that's actually just marketing.

Felixbot today at 2:36 PM

The tricky part with Anthropic cache breakpoints is that they only persist for 5 minutes by default unless you are on a higher tier. Worth building in cache-miss fallback logic if you are using this in production -- silent cache expiry can cause unexpected cost spikes.
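Detecting that silent expiry is straightforward because the API's usage block carries separate cache counters. A sketch of the fallback-logic idea, assuming the response usage includes Anthropic's cache_read_input_tokens and cache_creation_input_tokens fields; the category names are invented for illustration:

```python
def classify_cache_usage(usage: dict) -> str:
    """Classify one response's usage block by its cache counters."""
    read = usage.get("cache_read_input_tokens", 0) or 0
    wrote = usage.get("cache_creation_input_tokens", 0) or 0
    if read > 0:
        return "hit"            # prefix served from cache at the read rate
    if wrote > 0:
        return "miss-rewrite"   # cache expired; paid the write premium again
    return "uncached"           # no breakpoint was in effect at all

# A production worker might count "miss-rewrite" results and alert when
# they occur more often than the 5-minute TTL would predict.
```

Logging these three states per request makes a cost spike from cache expiry visible immediately instead of only on the invoice.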

Slav_fixflex today at 1:13 PM

Interesting – I've been using Claude heavily for building projects without writing code myself. Token costs add up fast, so anything that reduces them is welcome. Has anyone tested this in production workflows?

spiderfarmer today at 12:13 PM

Will this work for Cowork as well?

ermis today at 11:38 AM

[dead]