I don't understand the productivity that people get out of these AI tools. I've tried it and I just can't get anything remotely worthwhile unless it's something very simple or something completely new being built from the ground up.
Like sure, I can ask claude to give me the barebones of a web service that does some simple task. Or a webpage with some information on it.
But any time I've tried to get AI services to help with bugfixing/feature development on a large, complex, potentially multi-language codebase, it's useless.
And those tasks are the ones that actually take up the majority of my time. On the occasion that I'm spinning a new thing up quickly, I don't really need an AI to do it for me -- I mean, that's the easy part!
Is there something I'm missing? Am I just not using it right? I keep seeing people talk about how addictive it is, how the productivity boost is insane, how all their code is now written by AI and then audited, and I just don't see how that's possible outside of really simple rote programming.
I've had the same issue every time I've tried it. The code I generally work on is embedded C/C++ with in-house libraries where the tools are less than useful as they try to generate non-existant interfaces and generally generate worse code than I'd write by hand. There's a need for correctness and being able to explain the code thus use of those tools is also detrimental to explainability unless I hand-hold it to the point where I'm writing all of the code myself.
Generating function documentation hasn't been that useful either as the doc comments generated offer no insight and often the amount I'd have to write to get it to produce anything of value is more effort than just writing the doc comments myself.
For my personal project in zig they either get lost completely or gives me terrible code (my code isn't _that_ bad!). There seems to be no middle ground here. I've even tried the tools as pair programmers but they often get lost or stuck in loops of repeating the same thing that's already been mentioned (likely falls out of the context window).
When it comes to others using such tools I've had to ask them to stop using it to think as it becomes next to impossible to teach / mentor if they're passing that I say to the LLM or trying to have it perform the work. I'm confident in debugging people when it comes to math / programming but with an LLM between it's just not possible to guess where they went wrong or how to bring them back to the right path as the throught process is lost (or there wasn't one to begin with).
This is not even "vibe coding", I've just never found it generally useful enough to use day-to-day for any task and my primary use of say phind has been to use it as an alternative to qwant when I cannot game the search query well enough to get the search results I'm looking for (i.e I ignore the LLM output and just look at the references).
Yesterday I gave cursor a try and made my first (intentionally very lazy) vibe coding approach (a simple threejs project). It accepted the task and did things, failed, did things, failed, did things ... failed for good.
I guess I could work on the magic incantations to tweak here and there a bit until it works and I guess that's the way it is done. But I wasn't hooked.
I do get value out of LLM's for isolated broken down subtasks, where asking a LLM is quicker than googling.
For me, AI will probably become really usefull, once I can scan and integrate my own complex codebase so it gives me solutions that work there and not hallucinate API points or jump between incompatible libary versions (my main issue).
I find it's incredibly helpful for prototyping. These tools quickly reach a limit of complexity and put out sub par code, but for a green field prototype that's ok.
I've successfully been able to test out new libraries and do explorations quickly with AI coding tools and I can then take those working examples and fix them up manually to bring them up to my coding standards. I can also extend the lifespan of coding tools by doing cleanup cycles where I manually clean up the code since they work better with cleaner encapsulation, and you can use them to work on one scoped component at a time.
I've found that they're great to test out ideas and learn more quickly, but my goal is to better understand the technologies I'm prototyping myself, I'm not trying to get it to output production quality code.
I do think there's a future where LLMs can operate in a well architected production codebase with proper type safe compilation, linting, testing, encapsulation, code review, etc, but with a very tight leash because without oversight and quality control and correction it'll quickly degrade your codebase.
> Is there something I'm missing? Am I just not using it right?
The talk about it makes more sense when you remember most developers are primarily writing CRUD webapps or adware, which is essentially a solved problem already.
Just yesterday I wanted to adjust the Renovate setup in a repo.
I could spend 5-10 minutes digging on through the docs for the correct config option, or I can just tap a hotkey, open up GitHub Copilot in Rider and tell it what I want to achieve.
And within seconds it had a correct-looking setting ready to insert to my renovate.json file. I added it, tested it and it works.
I kinda think people who diss AIs are prompting something like "build me Facebook" and then being disappointed when it doesn't :D
I appreciate you voicing your feelings here. My previous employer requested we try AI tooling for productivity purposes, and I was finding myself in similar scenarios to what you mention. The parts that would have benefitted from a productivity gain weren’t seeing any improvement, while the areas that saw a speedup weren’t terribly mission-critical.
The one thing I really appreciated though was the AI’s ability to do a “fuzzy” search in occasional moments of need. Or, for example, sometimes the colloquial term for a feature didn’t match naming conventions in source code. The AI could find associations in commit messages and review information to save me time rummaging through git-blame. Like I said though, that sort of problem wasn’t necessarily a bottleneck and could often be solved much more cheaply by asking around coworker on Slack.
Probably 80% of the time I spend coding, I'm inside a code file I haven't read in the last month. If I need to spend more than 30 seconds reading a section of code before I understand it, I'll ask AI to explain it to me. Usually, it does a good job of explaining code at a level of complexity that would take me 1-15 minutes to understand, but does a poor job of answering more complex questions or at understanding more complex code.
It's a moderately useful tool for me. I suspect the people that get the most use out of are those that would take more than 1 hour to read code I would take 10 minutes to read. Which is to say the least experienced people get the most value.
>large, complex, potentially multi-language
I find that token memory size limits are the main barrier here.
Once the LLM starts forgetting other parts of the application, all bets are off and it will hallucinate the dumbest shit, or even just remove features wholesale.
You are not alone! I strongly agree and I feel like I am losing my mind reading some of the comments people have about these services.
For my workflow a lot of the benefit is in the smaller tasks. For example, get a medium size diff from two change-sets, paste it in, and ask to summarize the “what” and “why”. You have to learn how to give AI the right amount of context to get the right result back.
For code changes I prefer to paste a single function in, or a small file, or error output from a compile failure. It’s pretty good at helping you narrow things down.
So, for me, it’s a pile of small gains where the value is—because ultimately I know what I generally want to get done and it helps me get there.
I’ve had good experiences using it, but with the caveat that only Gemini Pro 2.5 has been at all useful, and only for “spot” tasks.
I typically use it to whip up a CLI tool or script to do something that would have been too fiddly otherwise.
While sitting in a Teams meeting I got it to use the Roslyn compiler SDK in a CLI tool that stripped a very repetitive pattern from a code base. Some OCD person had repeated the same nonsense many thousands of times. The tool cleaned up the mess in seconds.
Some people do really repetitive or really boilerplate things, others do not.
Also you have to learn to talk to it and how to ask it things.
The first and most important question to ask here is: are you using a coding agent? A lot of times, people who aren't getting much out of LLM-assisted coding are just asking Claude or GPT for code snippets, and pasting and building them themselves (or, equivalently, they're using LLM-augmented autocomplete in their editor).
Almost everybody doing serious work with LLMs is using an agent, which means that the LLM is authoring files, linting them, compiling them, and iterating when it spots problems.
There's more to using LLMs well than this, but this is the high-order bit.