Hacker News

sysmax · today at 5:06 PM

Well, LLMs are priced per token, and when editing code most of the output tokens are just echoing back the old code with minimal changes. So a lot of the cost is paying the LLM to repeat code it was already given.

Except, it's not that trivial to solve. I tried experimenting with asking the model to first give a list of symbols it will modify, and then just write the modified symbols. The results were OK, but less refined than when it echoes back the entire file.
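To make the experiment concrete, here's a minimal sketch of the "list the symbols, then emit only the modified symbols" scheme described above. Everything here is hypothetical: it assumes the model returns a dict mapping top-level function names to rewritten source, and uses Python's stdlib `ast` module just to find each function's line span.

```python
import ast

def apply_symbol_edits(source: str, edits: dict) -> str:
    """Replace each named top-level function with the model's rewritten version.

    `edits` maps function name -> new source text for that function
    (a hypothetical model output format, not any real harness's API).
    """
    tree = ast.parse(source)
    lines = source.splitlines(keepends=True)
    targets = [n for n in tree.body
               if isinstance(n, ast.FunctionDef) and n.name in edits]
    # Apply bottom-up so earlier line offsets stay valid after splicing.
    for node in sorted(targets, key=lambda n: n.lineno, reverse=True):
        new_body = edits[node.name]
        if not new_body.endswith("\n"):
            new_body += "\n"
        lines[node.lineno - 1 : node.end_lineno] = [new_body]
    return "".join(lines)

original = "def a():\n    return 1\n\ndef b():\n    return 2\n"
patched = apply_symbol_edits(original, {"b": "def b():\n    return 20\n"})
```

The unchanged function `a` never round-trips through the model at all, which is exactly the token saving being discussed.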

The way I see it is that when you echo back the entire file, the process of thinking "should I do an edit here" is distributed over a longer span, so it has more room to make a good decision. Like instead of asking "which 2 of the 10 functions should you change" you're asking it "should you change method1? what about method2? what about method3?", etc., and that puts less pressure on the LLM.

Except, currently we are effectively paying for the LLM to make that decision for *every token*, which is terribly inefficient. So, there has to be some middle ground between expensively echoing back thousands of unchanged tokens and giving an error-ridden high-level summary. We just haven't found that middle ground yet.


Replies

mmastractoday at 5:08 PM

I think the ideal way for these LLMs to work will be using AST-level changes instead of "let me edit this file".

grit.io was working on this years ago; I'm not sure if they're still around, but I liked their approach (their transform language was just very buggy).
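As a tiny illustration of the AST-level idea (using Python's stdlib `ast` module here rather than grit.io's own transform language, which works differently): the model would emit a structured rewrite of the tree, and the source is regenerated from it, so unchanged code never needs to be echoed verbatim.

```python
import ast

class RenameFunction(ast.NodeTransformer):
    """Rename a top-level function and all simple calls to it."""

    def __init__(self, old: str, new: str):
        self.old, self.new = old, new

    def visit_FunctionDef(self, node):
        if node.name == self.old:
            node.name = self.new
        self.generic_visit(node)
        return node

    def visit_Call(self, node):
        if isinstance(node.func, ast.Name) and node.func.id == self.old:
            node.func.id = self.new
        self.generic_visit(node)
        return node

src = "def fetch():\n    return 1\n\nresult = fetch()\n"
tree = RenameFunction("fetch", "fetch_v2").visit(ast.parse(src))
rewritten = ast.unparse(tree)
```

The edit is expressed as "rename `fetch` to `fetch_v2`" rather than as re-emitted file text, which is the shape of change an AST-level harness would ask the model for.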

gruez · today at 5:07 PM

>and most of the tokens are just echoing back the old code with minimal changes

I thought coding harnesses provided tools to apply diffs so the LLM didn't have to echo back the entire file?
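The diff-apply tools gruez mentions typically have the model emit search/replace pairs instead of the whole file. A minimal sketch of that idea (the format below is a made-up simplification, not any specific harness's actual protocol):

```python
def apply_search_replace(source: str, edits: list) -> str:
    """Apply model-emitted (search, replace) pairs to a file's text.

    Each search block must match exactly once, so an ambiguous or
    stale edit fails loudly instead of patching the wrong spot.
    """
    for search, replace in edits:
        if source.count(search) != 1:
            raise ValueError(f"search block must match exactly once: {search!r}")
        source = source.replace(search, replace, 1)
    return source

code = "def add(a, b):\n    return a - b\n"
fixed = apply_search_replace(code, [("return a - b", "return a + b")])
```

The model only pays tokens for the two short blocks, not the surrounding file, though as the parent comment notes, losing the full-file context can make the edits themselves less reliable.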
