> I've had the same issue every time I've tried it. The code I generally work on is embedded C/C++ with in-house libraries where the tools are less than useful as they try to generate non-existent interfaces and generally generate worse code than I'd write by hand.
That's because whatever training the model had, it didn't cover anything remotely similar to the codebase you work on.
We get this issue even with obscure FLOSS libraries.
When we fail to provide context to LLMs, they generate examples by following superficial cues like coding conventions. In extreme cases, such as code that employs source code generators or templates, LLMs even fill in function bodies that the code generators are designed to generate for you. That's because, when LLMs are oblivious to the context, they hallucinate their way into something seemingly coherent. Unless you provide them with context or instruct them not to make up stuff, they will bullshit their way into an example.
What's truly impressive about this is that oftentimes the hallucinated code actually works.
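To make the generator case concrete, here's a made-up sketch (the struct, its fields, and the generated-file name are all invented for illustration): the project only declares a method that a schema-driven generator is supposed to emit, and a context-free LLM just writes a body for it anyway.

```cpp
// Hypothetical example -- everything here is invented for illustration.
#include <cstdint>
#include <vector>

struct SensorFrame {
    uint32_t timestamp_ms;
    int16_t  temperature_c;

    // In the real project this would only be declared here; the definition
    // is emitted into sensor_frame.gen.cpp by the code generator.
    // An LLM that doesn't know that will happily fill in a body inline:
    std::vector<uint8_t> serialize() const {
        // Plausible-looking hand-rolled packing -- exactly the code the
        // generator owns, and often subtly different from its output
        // (endianness, padding, framing, versioning).
        std::vector<uint8_t> out;
        out.push_back(static_cast<uint8_t>(timestamp_ms >> 24));
        out.push_back(static_cast<uint8_t>(timestamp_ms >> 16));
        out.push_back(static_cast<uint8_t>(timestamp_ms >> 8));
        out.push_back(static_cast<uint8_t>(timestamp_ms));
        out.push_back(static_cast<uint8_t>(temperature_c >> 8));
        out.push_back(static_cast<uint8_t>(temperature_c));
        return out;
    }
};
```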
> Generating function documentation hasn't been that useful either as the doc comments generated offer no insight and often the amount I'd have to write to get it to produce anything of value is more effort than just writing the doc comments myself.
Again, this suggests a failure on your part to provide any context.
If you give an LLM enough context, it synthesizes and presents that information almost instantly. If you're prompting an LLM to generate documentation, which boils down to summarizing what an implementation does and what its purpose is, and the LLM comes up empty, that means you failed to give it anything to work on.
The bulk of your comment screams failure to provide any context. If your code strays far from what it expects, fails to follow any discernible structure, and doesn't even convey purpose and meaning in little things like naming conventions, you're not giving the LLM anything to work on.
I'm aware of the _why_, but that's exactly why the tools aren't useful in my case. If they cannot consume the codebase in a reasonable amount of time and provide value from that, then they generally aren't useful in the areas where I would want to use them (navigating large codebases). If the codebase is relatively small or the problem is known, then an LLM is no better than tab-complete, and arguably worse in many cases, since the generated result has to be parsed and added to my mental model of the problem rather than the mental model being built while working on the code itself.
I guess my point is, I have no use for LLMs in their current state.
> That's because whatever training the model had, it didn't cover anything remotely similar to the codebase you work on.
> We get this issue even with obscure FLOSS libraries.
This is the issue, however, as unfamiliar codebases are exactly where I'd want to use such tooling. Not working in those cases makes it less than useful.
> Unless you provide them with context or instruct them not to make up stuff, they will bullshit their way into an example.
In all cases context was provided extensively, but at some point it's easier to just write the code directly. The context is in the surrounding code, and if the tool cannot pick up on that even when combined with explicit direction, it's again less than useful.
> What's truly impressive about this is that oftentimes the hallucinated code actually works.
I haven't experienced the same. It fails more often than not, and the result is much worse than the hand-written solution regardless of the level of direction. This may be due to unfamiliar code, but again, if the code is common then I'm likely already familiar with it, which lowers the value of the tool.
> Again, this suggests a failure on your part to provide any context.
This feels like a case of blaming the user without full context of the situation. There are comments, the names are descriptive and within reason, and there are annotations of why certain things are done the way they are. The purpose of a doc comment is not "this does X" but rather _why_ you would want to use this function and what it's for, which is something LLMs struggle to derive, from my testing of them. Adding enough direction to convey that is effectively writing the documentation yourself with a crude English->English compiler in between. The same problem applies to unit test generation: unit tests are not there to game code coverage but to provide meaningful tests of the domain and the known edge cases of a function, which is again something LLMs struggle with.
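To illustrate the kind of doc comment I mean, here's a made-up example (the function and the hardware behavior it describes are invented for illustration):

```cpp
#include <cstdint>

// What generated doc comments tend to look like -- a restatement of the
// signature with no insight:
//
//   /// Sets the watchdog timeout.
//   /// @param timeout_ms Timeout in milliseconds.
//
// What I actually want the comment to carry is the why and the caveats:

/// Prefer this over writing the watchdog registers directly: it clamps the
/// value to what the 16-bit hardware counter can hold and converts from
/// milliseconds to counter ticks. Call it before enabling the watchdog,
/// otherwise the previous timeout stays latched for one more period.
void set_watchdog_timeout(uint32_t timeout_ms);
```

An LLM given only the signature will reliably produce the first style; getting the second style out of it means spelling out the why yourself, at which point you've already written the documentation.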
For any non-junior task, LLM tools are practically useless (from what I've tested), and for junior-level tasks it would be better to train someone to do better.