Joeri • yesterday at 7:39 PM • 2 replies

This sounds like one of those problems where the solution is not a UX tweak but an architecture change. Perhaps the prompt cache should be made resumable long term by storing it to disk before discarding it from memory?
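
A rough sketch of the spill-to-disk idea, to make it concrete. Everything here is hypothetical: `SpillingPromptCache`, `prefix_key`, the `.kv` file suffix, and the use of plain bytes as a stand-in for the real cached state are assumptions for illustration, not how any provider actually implements its prompt cache.

```python
import hashlib
import os
import tempfile
from collections import OrderedDict


def prefix_key(token_ids):
    """Content hash of the token prefix, so the same prompt maps to the same entry."""
    raw = ",".join(str(t) for t in token_ids).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


class SpillingPromptCache:
    """Hypothetical cache: LRU in memory, evicted entries are written to disk."""

    def __init__(self, spill_dir, max_in_memory=2):
        self.spill_dir = spill_dir
        self.max_in_memory = max_in_memory
        self._mem = OrderedDict()  # key -> bytes, least recently used first

    def _path(self, key):
        return os.path.join(self.spill_dir, key + ".kv")

    def put(self, token_ids, kv_bytes):
        key = prefix_key(token_ids)
        self._mem[key] = kv_bytes
        self._mem.move_to_end(key)
        # Instead of discarding the oldest entries, persist them first.
        while len(self._mem) > self.max_in_memory:
            old_key, old_val = self._mem.popitem(last=False)
            with open(self._path(old_key), "wb") as f:
                f.write(old_val)

    def get(self, token_ids):
        """Return (cached bytes or None, where it was found)."""
        key = prefix_key(token_ids)
        if key in self._mem:
            self._mem.move_to_end(key)
            return self._mem[key], "memory"
        path = self._path(key)
        if os.path.exists(path):
            with open(path, "rb") as f:
                kv_bytes = f.read()
            self.put(token_ids, kv_bytes)  # promote back into memory
            return kv_bytes, "disk"
        return None, "miss"


if __name__ == "__main__":
    cache = SpillingPromptCache(tempfile.mkdtemp(), max_in_memory=1)
    cache.put([1, 2, 3], b"state-a")
    cache.put([4, 5], b"state-b")  # pushes the first entry out to disk
    print(cache.get([1, 2, 3]))
```

A "disk" hit here means the session resumes without recomputing the prefix. The sketch assumes the later request reaches a machine that can read `spill_dir`.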

Replies

kivle • yesterday at 8:25 PM

I agree. Maybe parts of the cache contents are business secrets, but then store a server-side encrypted version on the user's disk so that it can be resumed without wasting 900k tokens?

slashdave • yesterday at 10:23 PM

Disk where? LLM requests are routed dynamically. You might not even land in the same data center.
