Hacker News

computably · yesterday at 7:39 PM

> I was never under the impression that gaps in conversations would increase costs nor reduce quality. Both are surprising and disappointing.

You didn't do your due diligence on an expensive API. A naïve implementation of an LLM chat is going to have O(N^2) costs from prompting with the entire context every time. Caching is needed to bring that down to O(N), but the cache itself takes resources, so evictions have to happen eventually.
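The quadratic-vs-linear claim can be sketched numerically (a toy model: the turn counts and per-turn token figures below are made up, and in practice cached reads are billed at a discounted rate rather than being free):

```python
def tokens_processed(turns: int, turn_tokens: int, cached: bool) -> int:
    """Total input tokens processed over a multi-turn chat.

    Without caching, the full conversation context is re-read on every
    turn, so the total grows quadratically in the number of turns.
    With a prefix cache, only the newly appended tokens are processed
    each turn, so the total grows linearly.
    """
    total = 0
    context = 0
    for _ in range(turns):
        context += turn_tokens  # context keeps growing every turn
        total += turn_tokens if cached else context
    return total

# 10 turns of ~1,000 tokens each:
print(tokens_processed(10, 1000, cached=False))  # 55000 — ~N^2 growth
print(tokens_processed(10, 1000, cached=True))   # 10000 — ~N growth
```

This also answers the "3-turn conversation" question below: with no caching and equal-size turns, 3 turns process 1+2+3 = 6 units of input rather than 3 — between 3x and 9x, trending toward N²/2 as the conversation lengthens.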


Replies

doesnt_know · yesterday at 8:27 PM

How do you do "due diligence" on an API that frequently makes undocumented changes and only publishes acknowledgement of change after users complain?

You're also talking about the internal technical implementation of a chat bot. 99.99% of users won't even understand the words being used.

solarkraft · yesterday at 8:23 PM

I somewhat disagree that this is due diligence. Claude Code abstracts the API, so it should abstract this behavior as well, or educate the user about it.

someguyiguess · yesterday at 8:24 PM

Yes. It’s perfectly reasonable to expect the user to know the intricacies of their LLM's caching strategy. Totally reasonable expectation.

margalabargala · yesterday at 9:00 PM

Okay, sure. There's a dollar/intelligence tradeoff. Let me decide to make it, don't silently make Claude dumber because I forgot about a terminal tab for an hour. Just because a project isn't urgent doesn't mean it's not important. If I thought it didn't need intelligence I would use Sonnet or Haiku.

exac · yesterday at 10:07 PM

IMO it's more useful to read posts and threads like this exact one. We can't know everything, and the market Claude Code currently addresses is far from people who would even think about caching to begin with.

kang · yesterday at 9:17 PM

It seems you haven't done the due diligence on which part of the API is expensive - constructing a prompt shouldn't cost the same as an LLM forward pass.

kovek · yesterday at 9:05 PM

What if the cache were backed up to cold storage instead of having to be recomputed from scratch?

bontaq · yesterday at 10:08 PM

How's that O(N^2)? How's it O(N) with caching? Does a 3 turn conversation cost 3 times as much with no caching, or 9 times as much?

raron · yesterday at 8:29 PM

How big is this cached data? Wouldn't it be possible to download it after a few minutes of idling "to suspend the session", then upload and restore it when the user starts their next interaction?

miroljub · yesterday at 9:55 PM

This sounds like a religious cult priest blaming the common people for not understanding the cult leader's wish, which he never clearly stated.