Hacker News

Prompt-caching – auto-injects Anthropic cache breakpoints (90% token savings)

66 points | by ermis | today at 11:38 AM | 26 comments

Comments

numlocked today at 12:57 PM

Per its own FAQ, this plugin is out of date and doesn't actually do anything incremental re: caching:

> "Hasn't Anthropic's new auto-caching feature solved this?"

> Largely, yes — Anthropic's automatic caching (passing "cache_control": {"type": "ephemeral"} at the top level) handles breakpoint placement automatically now. This plugin predates that feature and originally filled that gap.

somesnm today at 12:15 PM

Hasn't this been largely solved by the auto-caching Anthropic introduced recently, where you pass "cache_control": {"type": "ephemeral"} in your request and it places breakpoints automatically? https://platform.claude.com/docs/en/build-with-claude/prompt...
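For reference, the longer-standing form places the breakpoint on a specific content block rather than at the top level. A minimal sketch of that request shape, assuming the Python anthropic SDK conventions; the model name and prompt text here are placeholders, not values from the thread:

```python
# Sketch of a Messages API payload that marks a large, stable system
# prompt as cacheable by attaching cache_control to that block.
def build_cached_request(system_prompt: str, user_message: str) -> dict:
    """Build a request body whose system prompt is a cache breakpoint."""
    return {
        "model": "claude-sonnet-4-5",  # placeholder model name
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                # Everything up to and including this block is cached;
                # repeated calls with the same prefix read from cache.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_message}],
    }

payload = build_cached_request("You are a helpful assistant. " * 100, "Hello")
```

The dict can be splatted into `client.messages.create(**payload)`; the per-block form is what the plugin automated before breakpoint placement became automatic.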

joemazerino today at 3:39 PM

Firing off a cache write costs 1.2x tokens IIRC, meaning non-repeatable tasks will cost more in the long run.
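The break-even point is easy to work out. A small sketch using commonly cited multipliers (cache write at roughly 1.25x the base input price, cache read at roughly 0.1x; treat both as assumptions and check current Anthropic pricing):

```python
def caching_cost(prefix_tokens: float, calls: int,
                 write_mult: float = 1.25, read_mult: float = 0.1) -> float:
    """Input-token cost (in base-token units) for a cached prefix:
    one full-price write, then discounted reads on later calls."""
    if calls <= 0:
        return 0.0
    return prefix_tokens * (write_mult + (calls - 1) * read_mult)

def uncached_cost(prefix_tokens: float, calls: int) -> float:
    """Same prefix sent at full price on every call."""
    return prefix_tokens * calls

# A one-off task pays the write premium with nothing to amortize it:
# 1250 cached vs 1000 uncached for a 1000-token prefix and one call.
# By the second call within the TTL, caching already wins: 1350 vs 2000.
```

So the commenter's point holds for single-shot tasks, while anything that reuses the prefix even once comes out ahead.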

katspaugh today at 12:35 PM

> This plugin is built for developers building their own applications with the Anthropic API.

> Important note for Claude Code users: Claude Code already handles prompt caching automatically for its own API calls — system prompts, tool definitions, and conversation history are cached out of the box.

Source: their GitHub

adi_pradhan today at 12:56 PM

This applies only to the raw API, from what I understand, since Claude Code already caches quite aggressively (try npx ccusage).

Also, the Anthropic API already introduced prompt caching: https://platform.claude.com/docs/en/build-with-claude/prompt...

What is new here?

mijoharas today at 12:26 PM

I don't understand; Claude Code already has automatic prompt caching built in.[0] How does this change things?

[0] https://code.claude.com/docs/en/costs

fschuett today at 12:51 PM

Slightly off-topic, but I recently tested some tool and it turns out Opus is far cheaper than Sonnet, because it produces far fewer output tokens, and those are what's expensive. Sonnet is also much slower than Opus (I did 9 runs comparing Haiku, Sonnet and Opus on the same problem). I had thought "oh, Sonnet is more lightweight and cheaper than Opus"; no, that's actually just marketing.

Felixbot today at 2:36 PM

The tricky part with Anthropic cache breakpoints is that they only persist for 5 minutes by default unless you are on a higher tier. Worth building in cache-miss fallback logic if you are using this in production -- silent cache expiry can cause unexpected cost spikes.
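Detecting that silent expiry is straightforward because the API's usage block carries separate cache counters. A sketch of the fallback-logic idea, assuming the response usage includes Anthropic's cache_read_input_tokens and cache_creation_input_tokens fields; the category names are invented for illustration:

```python
def classify_cache_usage(usage: dict) -> str:
    """Classify one response's usage block by its cache counters."""
    read = usage.get("cache_read_input_tokens", 0) or 0
    wrote = usage.get("cache_creation_input_tokens", 0) or 0
    if read > 0:
        return "hit"            # prefix served from cache at the read rate
    if wrote > 0:
        return "miss-rewrite"   # cache expired; paid the write premium again
    return "uncached"           # no breakpoint was in effect at all

# A production worker might count "miss-rewrite" results and alert when
# they occur more often than the 5-minute TTL would predict.
```

Logging these three states per request makes a cost spike from cache expiry visible immediately instead of only on the invoice.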

Slav_fixflex today at 1:13 PM

Interesting – I've been using Claude heavily for building projects without writing code myself. Token costs add up fast, so anything that reduces them is welcome. Has anyone tested this in production workflows?

spiderfarmer today at 12:13 PM

Will this work for Cowork as well?

ermis today at 11:38 AM

[dead]