computably • yesterday at 7:39 PM • 10 replies

> I was never under the impression that gaps in conversations would increase costs nor reduce quality. Both are surprising and disappointing.

You didn't do your due diligence on an expensive API. A naïve implementation of an LLM chat is going to have O(N^2) costs from prompting with the entire context every time. Caching is needed to bring that down to O(N), but the cache itself takes resources, so evictions have to happen eventually.
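
To make the scaling concrete, here is a rough sketch of the arithmetic. It rests on simplifying assumptions of mine, not on any provider's actual billing: every turn adds the same number of tokens (`tokens_per_turn`, a made-up parameter), and a cache hit is treated as free, whereas providers typically still charge something for cached reads. The function names are hypothetical too.

```python
def uncached_tokens(turns: int, tokens_per_turn: int) -> int:
    """Prompt tokens processed in total when each turn reprocesses the whole history.

    Turn k sends k * tokens_per_turn tokens, so the total is
    tokens_per_turn * turns * (turns + 1) / 2, which grows as O(N^2).
    """
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


def cached_tokens(turns: int, tokens_per_turn: int) -> int:
    """Prompt tokens processed in total when the earlier prefix is always a cache hit.

    Assumption: cached tokens cost nothing, so only the new tokens of each
    turn are counted. The total is tokens_per_turn * turns, which is O(N).
    """
    return turns * tokens_per_turn


if __name__ == "__main__":
    # Compare the two totals for a few conversation lengths.
    for n in (1, 3, 10, 100):
        print(n, uncached_tokens(n, 1000), cached_tokens(n, 1000))
```

Under these assumptions a 3-turn conversation processes 6 turns' worth of tokens without caching (1 + 2 + 3) versus 3 with it, and the gap widens quadratically as the conversation grows.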

Replies

doesnt_know • yesterday at 8:27 PM

How do you do "due diligence" on an API that frequently makes undocumented changes and only publishes acknowledgement of change after users complain?

You're also talking about internal technical implementations of a chat bot. 99.99% of users won't even understand the words that are being used.

solarkraft • yesterday at 8:23 PM

I somewhat disagree that this is due diligence. Claude Code abstracts the API, so it should abstract this behavior as well, or educate the user about it.

someguyiguess • yesterday at 8:24 PM

Yes. It’s perfectly reasonable to expect the user to know the intricacies of the caching strategy of their LLM. Totally reasonable expectation.

margalabargala • yesterday at 9:00 PM

Okay, sure. There's a dollar/intelligence tradeoff. Let me decide to make it; don't silently make Claude dumber because I forgot about a terminal tab for an hour. Just because a project isn't urgent doesn't mean it's not important. If I thought it didn't need intelligence I would use Sonnet or Haiku.

exac • yesterday at 10:07 PM

It is more useful to read posts and threads like this exact thread IMO. We can't know everything, and the currently addressed market for Claude Code is far from people who would even think about caching to begin with.

kang • yesterday at 9:17 PM

It seems you haven't done the due diligence on which part of the API is expensive: constructing a prompt shouldn't carry the same charge/cost as an LLM pass.

kovek • yesterday at 9:05 PM

What if the cache were backed up to cold storage, instead of having to recompute everything?

bontaq • yesterday at 10:08 PM

How's that O(N^2)? How's it O(N) with caching? Does a 3-turn conversation cost 3 times as much with no caching, or 9 times as much?

raron • yesterday at 8:29 PM

How big is this cached data? Wouldn't it be possible to download it after idling for a few minutes "to suspend the session", and upload and restore it when the user starts their next interaction?

miroljub • yesterday at 9:55 PM

This sounds like a religious cult priest blaming the common people for not understanding the cult leader's wish, which he never clearly stated.
