Hacker News

skeptic_ai yesterday at 11:30 PM

Don’t forget context. Basically I have 2 billion input tokens and 1 million output tokens. Every prompt you send resends the whole thing, again and again. Let’s say you have 500k of context used: 10 messages is 5 million input tokens, 100 messages is 50 million. Run 5 threads and that’s 250 million.
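
To make the arithmetic concrete, here is a rough sketch of the numbers above. It assumes the context stays fixed at 500k tokens on every call and ignores prompt caching entirely; `resent_input_tokens` is a made-up helper name, not part of any API.

```python
def resent_input_tokens(context_tokens: int, messages: int, threads: int = 1) -> int:
    """Total input tokens sent when every message resends the same context.

    Assumption: the context size is constant across calls and nothing is
    served from a cache, so each message costs the full context again.
    """
    return context_tokens * messages * threads


if __name__ == "__main__":
    print(resent_input_tokens(500_000, 10))       # 5000000
    print(resent_input_tokens(500_000, 100))      # 50000000
    print(resent_input_tokens(500_000, 100, 5))   # 250000000
```

In a real session the context grows with each turn, so this simplified model is only a ballpark figure.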


Replies

consumer451 yesterday at 11:56 PM

But how is it even possible (bad harness?), or wise, to send 500k or 1M tokens per call? Regarding cache, how are you not hitting the 1hr cache? Also, start new chats early and often!

I have been "agentic coding" since Sonnet 3.5 and after this paper came out, it became my bible:

https://github.com/adobe-research/NoLiMa

Last I checked, all models suck as you fill the context window. "Context engineering" is how you do this whole thing.