Hacker News

qeternity | yesterday at 9:38 PM

In a chat setting you hit the cache every time you send a new message: all prior question/answer pairs form an unchanged prefix of the context, so they don’t need to be prefilled again.
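A minimal sketch of that loop, using the OpenAI Python SDK (model name illustrative; any provider with automatic prefix caching behaves the same way):

    from openai import OpenAI

    client = OpenAI()
    history = [{"role": "system", "content": "You are a helpful assistant."}]

    while True:
        history.append({"role": "user", "content": input("> ")})
        # The full history is resent every turn, but it is byte-identical
        # up to the newest message, so a prefix-caching provider (e.g.
        # OpenAI's automatic caching on sufficiently long prompts) skips
        # prefilling everything before the new suffix.
        resp = client.chat.completions.create(model="gpt-4o", messages=history)
        answer = resp.choices[0].message.content
        history.append({"role": "assistant", "content": answer})
        print(answer)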

On the API side, imagine you’re doing document processing and have a 50k-token instruction prompt that you reuse for every document.

It’s extremely viable and used all the time.
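For example, with Anthropic’s prompt-caching API you mark the reusable instruction block explicitly (a sketch; the model name and file path are illustrative):

    import anthropic

    client = anthropic.Anthropic()
    INSTRUCTIONS = open("instructions.txt").read()  # the ~50k-token reusable prompt

    def process(document: str) -> str:
        resp = client.messages.create(
            model="claude-3-5-sonnet-latest",
            max_tokens=1024,
            system=[{
                "type": "text",
                "text": INSTRUCTIONS,
                # Mark the instruction block cacheable: subsequent calls
                # that share this exact prefix skip prefilling it and pay
                # a reduced rate on the cached tokens.
                "cache_control": {"type": "ephemeral"},
            }],
            messages=[{"role": "user", "content": document}],
        )
        return resp.content[0].text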


Replies

jonhohle | yesterday at 9:45 PM

I’m shocked that this hasn’t been a thing from the start. That seems like table stakes for automating repetitive tasks.
