Hacker News

Jgrubb, today at 11:39 AM

The tokens are still being burnt; it just happens in a parallel dimension from the user's main context window.


Replies

ajmurmann, today at 1:11 PM

It's true that the initial tool response still has the same number of tokens, but it doesn't keep getting dragged along in the longer-lived top-level context.
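A rough back-of-the-envelope sketch of that point, assuming every turn re-sends the whole context as input and ignoring prompt caching, output tokens, and context growth between turns. The function names and all numbers are made up for illustration, not taken from any real agent framework or measurement.

```python
def inline_cost(base_context: int, tool_response: int, turns: int) -> int:
    """Input tokens when the tool response stays in the main context.

    Every later turn re-sends the base context plus the full tool response.
    """
    return turns * (base_context + tool_response)


def delegated_cost(
    base_context: int,
    tool_response: int,
    summary: int,
    subagent_prompt: int,
    turns: int,
) -> int:
    """Input tokens when a subagent reads the tool response once.

    The subagent pays for the full response a single time; the main
    context only carries the (hypothetical) short summary afterwards.
    """
    one_off = subagent_prompt + tool_response
    carried = turns * (base_context + summary)
    return one_off + carried


# Hypothetical numbers: 2k base context, 10k tool response,
# 500-token summary, 500-token subagent prompt, 10 further turns.
inline = inline_cost(2_000, 10_000, 10)
delegated = delegated_cost(2_000, 10_000, 500, 500, 10)
print(inline, delegated)
```

Under these assumptions the tool response is paid for once in the subagent instead of once per remaining turn, so the gap widens the longer the top-level conversation runs.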

ViewTrick1002, today at 11:45 AM

The real benefit is being able to use a cheaper, but good enough, model with a specific system prompt dedicated to that task.