No? It's not because it's a cache, it's because they're scared of letting you se...

3836293648 • yesterday at 8:00 PM • 3 replies • view on HN

No? It's not because it's a cache, it's because they're scared of letting you see the thinking trace. If you got the trace you could just send it back in full when it got evicted from the cache. This is how open weight models work.

Replies

mpyne • yesterday at 10:12 PM

The trace goes back fine, that's not the issue.

The issue is that if they send the full trace back, it will have to be processed from the start if the cache expired, and doing that will cause a huge one-time hit against your token limit if the session has grown large.

So what Boris talked about is stripping things out of the trace that goes back to regenerate the session if the cache expires. Doing this would help avert burning up the token limit, but it is technically a different conversation, so if CC chooses poorly on stripping parts of the context then it would lead to Claude getting all scatter-brained.

➕ show 1 reply

reactordev • yesterday at 8:15 PM

They are sending it back to the cache, the part you are missing is they were charging you for it.

➕ show 1 reply

eknkc • yesterday at 8:09 PM

I’m not familiar with the Claude API but OpenAI has an encrypted thking messages option. You get something that you can send back but it is encrypted. Not available on Anthropic?

alt Hacker News

Replies