I think that you send the entire conversation with every request.
As long as you stay under the 1-hour caching TTL for your open threads, I guess your marginal cost is linear.
This is me on a weekday flicking between Ghostty tabs to enter “stand by” every ~45 mins.
As long as you stay under the 1-hour caching TTL for your open threads, I guess your marginal cost is linear.
This is me on a weekday flicking between Ghostty tabs to enter “stand by” every ~45 mins.