Hacker News

tauoverpi · yesterday at 4:54 AM · 0 replies

I'm aware of the _why_, but this is why the tools aren't useful for my case. If they cannot consume the codebase in a reasonable amount of time and provide value from that, then they generally aren't useful in the areas where I would want to use them (navigating large codebases). If the codebase is relatively small or the problem is known, then an LLM is not any better than tab-complete, and arguably worse in many cases: the generated result has to be parsed and added to my mental model of the problem, rather than the mental model being constructed while working on the code itself.

I guess my point is, I have no use for LLMs in their current state.

> That's because whatever training the model had, it didn't cover anything remotely similar to the codebase you worked on.

> We get this issue even with obscure FLOSS libraries.

This is the issue, however, as unfamiliar codebases are exactly where I'd want to use such tooling. Not working in those cases makes it less than useful.

> Unless you provide them with context or instruct them not to make up stuff, they will resort to bullshit their way into an example.

In all cases context was provided extensively, but at some point it's easier to just write the code directly. The context is in the surrounding code; if the tool cannot pick up on that, even when combined with explicit direction, it is again less than useful.

> What's truly impressive about this is that often times the hallucinated code actually works.

I haven't experienced the same. It fails more often than not, and the result is much worse than the hand-written solution regardless of the level of direction. This may be due to unfamiliar code, but again, if the code is common then I'm likely familiar with it already, which lowers the value of the tool.

> Again, this suggests a failure on your side for not providing any context.

This feels like a case of blaming the user without full context of the situation. There are comments, the names are descriptive and within reason, and there are annotations of why certain things are done the way they are. The purpose of a doc comment is not "this does X" but rather _why_ you would want to use the function, which is something LLMs struggle to derive in my testing. Adding enough direction to describe that is effectively writing the documentation yourself, with a crude English-to-English compiler in between.

This is the same problem with unit test generation: unit tests are not there to game code coverage but to provide meaningful tests of the domain and the known edge cases of a function, which is again something the LLM struggles with.
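To illustrate the distinction being made (a minimal sketch; the function and names here are hypothetical, not from any codebase under discussion): a doc comment should capture the _why_ and the intended boundary of use, and a unit test should pin down the domain's known edge cases rather than merely touch every line:

```python
def normalize_path(path: str) -> str:
    """Collapse redundant separators before cache lookup.

    Cache keys are compared byte-for-byte, so "a//b" and "a/b"
    would otherwise miss each other's entries. Use this at every
    cache boundary; do NOT use it for user-facing display, since
    it drops trailing slashes some callers treat as meaningful.
    """
    # A "this does X" doc comment would merely say: "Normalizes a path."
    while "//" in path:
        path = path.replace("//", "/")
    return path.rstrip("/") or "/"


# Meaningful tests target known edge cases of the domain,
# not just line coverage:
assert normalize_path("a//b") == "a/b"   # redundant separator collapsed
assert normalize_path("a/b/") == "a/b"   # trailing slash dropped
assert normalize_path("/") == "/"        # root must survive the rstrip
```

The first docstring sentence a generator tends to produce is the "Normalizes a path" variant; the caveat about user-facing display is the part that has to come from the author's intent.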

For any non-junior task, LLM tools are practically useless (from what I've tested), and for junior-level tasks it would be better to train a person to do the work well.