jcparkyn • today at 11:11 AM

Agent loops (particularly coding agents) involve a huge amount of repetition, because the entire context is included in every model request. As long as that repeated context sits at the start of the input and doesn't change, the request can hit the KV cache (assuming the model provider actually still has the prefix in cache).

This only works because prompt caching is done by matching prefixes, not the entire input.
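
To make that concrete, here's a rough toy sketch in Python. It is not how any real provider implements its KV cache: `PrefixCache`, `shared_prefix_len`, and the fake "tokens" are all made up for illustration, and real caches work on tokenized blocks with eviction. It only shows why an append-only agent context keeps hitting the cache while a change at the start of the input misses entirely.

```python
def shared_prefix_len(a, b):
    """Number of leading items that two sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class PrefixCache:
    """Hypothetical stand-in for a provider-side cache matched on prefixes."""

    def __init__(self):
        self.seen = []  # every token sequence sent so far (no eviction)

    def request(self, tokens):
        # Reuse the longest prefix shared with any earlier request.
        hit = max((shared_prefix_len(tokens, s) for s in self.seen), default=0)
        self.seen.append(list(tokens))
        return hit, len(tokens) - hit  # (cached tokens, tokens to recompute)


cache = PrefixCache()
context = ["system", "tools"]
results = []

# Agent loop: each turn appends to the context and resends all of it.
for turn in ["user: fix bug", "assistant: read file", "tool: file contents"]:
    context = context + [turn]
    results.append(cache.request(context))

# Same content, but something new at the very start: the prefix no longer matches.
changed_hit = cache.request(["timestamp"] + context)

print(results)      # [(0, 3), (3, 1), (4, 1)]
print(changed_hit)  # (0, 6)
```

After the first request, only the newly appended turn has to be recomputed each time. Prepending anything that varies (a timestamp in this made-up case) throws away the whole match, which is why stable content should go first.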
