Hacker News

wolttam yesterday at 6:17 PM

It depends on the use case. Yes, 90% of the cost is cache in agentic coding scenarios (actually 95% in my experience), but not when the model reasons for 200k+ tokens before answering a complex problem.
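
A rough sketch of the arithmetic behind that, assuming made-up prices (the three `PRICE_*` constants are hypothetical, not any vendor's real rates) and assuming reasoning tokens are billed as output. The token counts are illustrative too; the point is only how the cache share swings between the two workloads.

```python
# Hypothetical prices in USD per million tokens (illustrative only).
PRICE_CACHED_INPUT = 0.30
PRICE_FRESH_INPUT = 3.00
PRICE_OUTPUT = 15.00  # assumption: reasoning tokens are billed as output


def cache_share(cached_in: int, fresh_in: int, out: int) -> float:
    """Fraction of total cost that comes from cached input tokens."""
    cache_cost = cached_in * PRICE_CACHED_INPUT
    total = cache_cost + fresh_in * PRICE_FRESH_INPUT + out * PRICE_OUTPUT
    return cache_cost / total


# Agentic coding: many turns, each re-reading a large cached context.
agentic = cache_share(cached_in=40_000_000, fresh_in=200_000, out=50_000)

# One hard problem: small prompt, 200k tokens of reasoning before the answer.
reasoning = cache_share(cached_in=50_000, fresh_in=10_000, out=200_000)

print(f"agentic: {agentic:.0%}, long reasoning: {reasoning:.1%}")
```

With these assumed numbers the cache dominates the agentic run and is close to irrelevant in the long-reasoning one, where output tokens carry almost all of the cost.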


Replies

himata4113 yesterday at 6:32 PM

Gemini models solve a problem in 80% fewer tokens, so that's something to think about.
