Yes. Natively, the models are limited to 200k tokens of context, which is on the order of dozens of files: way too small for most real codebases.
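For a rough sense of scale (the per-file numbers here are assumptions, not measurements):

```python
# Back-of-envelope: a typical source file might run ~500 lines
# at ~12 tokens per line. Both figures are rough assumptions.
context_window = 200_000                  # model's context limit, in tokens
tokens_per_file = 500 * 12                # ~6,000 tokens per file (assumed)
print(context_window // tokens_per_file)  # -> 33, i.e. "dozens" of files
```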
But Codebuff has a whole preliminary step where it searches your codebase for the files relevant to your query, and only those get added to the coding agent's context.
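I don't know how Codebuff's search works internally, but a toy sketch of the general shape (naive keyword overlap standing in for whatever real relevance ranking it uses, and the `*.py` filter is just for illustration) might look like:

```python
from pathlib import Path

def find_relevant_files(repo_root: str, query: str, top_k: int = 10) -> list[Path]:
    """Rank source files by naive keyword overlap with the query
    and return the top_k. A toy stand-in for a real relevance search."""
    query_terms = set(query.lower().split())
    scored = []
    for path in Path(repo_root).rglob("*.py"):
        text = path.read_text(errors="ignore").lower()
        score = sum(text.count(term) for term in query_terms)
        if score:
            scored.append((score, path))
    scored.sort(reverse=True, key=lambda pair: pair[0])
    # Only these files (not the whole repo) go into the agent's context.
    return [path for _, path in scored[:top_k]]
```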
That's why I think it should work on codebases up to medium-large size. If the codebase is too large, our file-finding step will also start to fail.
I would give it a shot on your codebase. I think it should work.
I'll need to get approval to use this on that codebase. I've tried it out on a smaller open-source codebase as a first step.
For anyone interested:
- here's the Codebuff session: https://gist.github.com/craigds/b51bbd1aa19f2725c8276c5ad36947e2
- The result was this PR: https://github.com/koordinates/kart/pull/1011
It required a bit of back and forth to produce a relatively small change, and I think it was a bit too narrow in the files it selected (it missed updating the implementations of a method in some subclasses, since it never looked at those files). So I'm not sure this saved me time, but it's promising nevertheless! I'm looking forward to what it will be capable of in 6 months.
What's the fundamental limit on context size here? Why can't a model be fine-tuned per codebase, taking the entire code into context (and be continuously trained as it's updated)?
Forgive my naivety, I don't know anything about LLMs.
RAG is a well-known technique now, and to paraphrase Emily Bender[1], here are some reasons why it's not a solution.
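(For anyone who hasn't seen the pattern: RAG, retrieval-augmented generation, is roughly a retrieve-then-generate loop. A minimal sketch follows; `search_index` and `llm_complete` are hypothetical stubs, not any real library's API.)

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str

def search_index(query: str, top_k: int = 5) -> list[Doc]:
    # Hypothetical retriever stub; a real system would query a search/vector index.
    corpus = [Doc("def diff(...): ..."), Doc("class WorkingCopy: ...")]
    return corpus[:top_k]

def llm_complete(prompt: str) -> str:
    # Hypothetical model call; a real system would hit an LLM API here.
    return "synthetic answer based on: " + prompt[:40]

def rag_answer(query: str) -> str:
    docs = search_index(query)                   # 1. retrieve relevant documents
    context = "\n\n".join(d.text for d in docs)  # 2. stuff them into the prompt
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    return llm_complete(prompt)                  # 3. generate: output is still synthetic text
```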
The code extruded from the LLM is still synthetic code, likely to contain errors both of commission (extra tokens motivated by the LLM's pre-training data rather than the input texts) and of omission. It's difficult to detect when the summary you are relying on is actually missing critical information.
Even if the setup includes links to the retrieved documents, the presence of the generated code discourages users from actually drilling down and reading them.
This is still a framing that says: Your question has an answer, and the computer can give it to you.
[1] https://buttondown.com/maiht3k/archive/information-literacy-...