Hacker News

planckscnst · today at 2:00 AM

I'm working on lots of projects. My favorite is what I call "context bonsai", where I give LLM harnesses the ability to surgically edit their own context. It's exposed as a tool: you can say "remove that failed debugging session and write a summary of what we learned," or take a more hands-on approach and say "remove messages msg_ID1 through msg_ID2". The removal leaves behind a summary and keywords, and the original messages can be pulled back into context if the LLM thinks they're useful.
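
Roughly, the mechanics look like this. The sketch below is illustrative, not the actual opencode code: the names (remove_span, restore_span), the stub format, and the data structures are made up to show the idea of pruning a span, leaving a summary-plus-keywords stub, and keeping the originals recallable.

```python
# Hypothetical sketch of a "context bonsai" style tool (not the real
# implementation): pruning a span of messages leaves a summary stub in
# place, and the originals are archived so they can be restored later.
from dataclasses import dataclass, field

@dataclass
class Message:
    id: str
    role: str
    content: str

@dataclass
class Context:
    messages: list[Message]
    archive: dict[str, list[Message]] = field(default_factory=dict)

def remove_span(ctx: Context, start_id: str, end_id: str,
                summary: str, keywords: list[str]) -> str:
    ids = [m.id for m in ctx.messages]
    lo, hi = ids.index(start_id), ids.index(end_id)
    removed = ctx.messages[lo:hi + 1]
    stub_id = f"stub_{start_id}_{end_id}"
    ctx.archive[stub_id] = removed  # keep originals for later recall
    stub = Message(stub_id, "system",
                   f"[pruned {len(removed)} messages] {summary} "
                   f"(keywords: {', '.join(keywords)}; "
                   f"recall with restore_span('{stub_id}'))")
    ctx.messages[lo:hi + 1] = [stub]
    return stub_id

def restore_span(ctx: Context, stub_id: str) -> None:
    idx = next(i for i, m in enumerate(ctx.messages) if m.id == stub_id)
    ctx.messages[idx:idx + 1] = ctx.archive.pop(stub_id)

# e.g. remove_span(ctx, "msg_ID1", "msg_ID2",
#                  "Tried approach X; it failed because of Y.",
#                  ["debugging", "approach X"])
```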

I would really like people to try it out and report bugs, failures, and successes.

https://github.com/Vibecodelicious/opencode/blob/surgical_co...

I'm currently trying to get the LLM to be more proactive about removing content that is no longer useful, both to stay ahead of autocompaction and to keep the context window small and focused in general.
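
Something like this pre-turn check is the kind of nudge I mean. The window size, threshold, and token-counting helper here are placeholders, not the actual implementation.

```python
# Rough sketch of a proactive pruning nudge: when the conversation crosses
# a fraction of the context window, ask the model to nominate prunable spans
# before autocompaction does it for you. All numbers are made up.
def maybe_prompt_prune(messages, count_tokens, window=200_000, threshold=0.6):
    used = sum(count_tokens(m) for m in messages)
    if used > threshold * window:
        return (f"Context is at {used / window:.0%} of the window. "
                "Identify message spans that are no longer useful and prune "
                "them before autocompaction kicks in.")
    return None

# e.g. maybe_prompt_prune(history, lambda s: len(s) // 4)
```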


Replies

robviren · today at 3:44 AM

I find it fascinating to give the LLMs huge stacks of reflective context. It's incredible how good they are at getting a feel for huge amounts of CSV-like data. I imagine they would be good at trimming their own context down.

I did some experiments exposing the raw latent states of a small 1B Gemma model (via hooks) to a large model as it processed data. I'm curious whether the large model can nudge the smaller model's latents to get the outputs it wants. I desperately want to get thinking out of tokens and into latent space; it's something I've been chasing for a bit.
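
For anyone curious what the hook side looks like: a minimal sketch assuming the Hugging Face transformers Gemma layout (model.model.layers). The model name is illustrative, and actually feeding the captured tensors to a larger model is left out.

```python
# Capture per-layer hidden states of a small Gemma model with forward hooks.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/gemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

captured = {}  # layer index -> hidden states from the last forward pass

def make_hook(i):
    def hook(module, inputs, output):
        # Decoder layers return hidden states first (sometimes inside a tuple).
        hidden = output[0] if isinstance(output, tuple) else output
        captured[i] = hidden.detach()
        # Returning a modified output here is where an outer controller
        # (e.g. a larger model) could nudge the small model's latents.
        return output
    return hook

handles = [layer.register_forward_hook(make_hook(i))
           for i, layer in enumerate(model.model.layers)]

with torch.no_grad():
    model(**tok("Hello there", return_tensors="pt"))

for h in handles:
    h.remove()

print({i: tuple(v.shape) for i, v in captured.items()})
```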

maven29 · today at 5:20 AM

I'm sure you're aware, but it's worth pointing out that you will lose your cache-hit discounts with some providers: prompt caches are prefix-based, so editing anything earlier in the conversation invalidates the cache from that point on. The next turn then incurs the cost of the whole trajectory billed at fresh input-token rates.
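
Back-of-envelope, with made-up numbers (cache reads are often around a tenth of the fresh input price, but check your provider's pricing):

```python
# Illustrative only: editing the prefix turns cached reads back into fresh input.
prefix_tokens = 100_000
fresh_rate = 3.00 / 1_000_000    # $ per input token (hypothetical)
cached_rate = 0.30 / 1_000_000   # cache-read rate, ~10% of fresh (hypothetical)

cost_unchanged = prefix_tokens * cached_rate  # prefix untouched: $0.03
cost_edited = prefix_tokens * fresh_rate      # prefix edited:    $0.30
print(cost_unchanged, cost_edited)
```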

As an aside, 95 pages into the system card for Claude Opus 4.6, Anthropic acknowledges that they have disabled prompt prefill.
