> If your metric is an LLM that can copy/paste without alterations, and never hallucinate APIs, then yeah, you'll always be disappointed with them.
I struggle to take comments like this seriously - yes, it is very reasonable to expect these magical tools to copy and paste something without alterations. How on earth is that an unreasonable ask?
The whole discourse around LLMs is so utterly exhausting. If I say I don't like them for almost any reason, I'm a luddite. If I complain about their shortcomings, I'm just using them wrong. If I try to use them the "right" way and they still get extremely basic things wrong, then my expectations are too high.
What, precisely, are they good for?
I think what they're best at right now is the initial scaffolding work of projects. A lot of the annoying bootstrap shit that I hate doing is actually generally handled really well by Codex.
I agree that there's definitely some overhype to them right now. At least for the stuff I've done, they have gotten considerably better though, to the point where the code they generate is often usable, if sub-optimal.
For example, about three years ago, I was trying to get ChatGPT to write me a fairly basic ZeroMQ program in C. It generated something that looked correct, but it would crash pretty much immediately, because it kept trying to use a pointer after freeing it.
I tried the same thing again with Codex about a week ago, and it worked out of the box, and I was even able to get it to do more stuff.
For a long time, I've wanted to write a blog post on why programmers don't understand the utility of LLMs[1], whereas non-programmers easily see it. But I struggle to articulate it well.
The gist is this: Programmers view computers as deterministic. They can't tolerate a tool that behaves differently from run to run. They have a very binary view of the world: If it can't satisfy this "basic" requirement, it's crap.
Programmers have made their careers (and possibly lives) being experts at solving problems that greatly benefit from determinism. A problem that doesn't - well, that needs to be solved either by sophisticated machine learning or by a human. They're trained to essentially ignore those problems - it's not their expertise.
And so they get really thrown off when people use computers in a nondeterministic way to solve a deterministic problem.
For everyone else, the world, and its solutions, are mostly non-deterministic. When they solve a problem, or when they pay people to solve a problem, the guarantees are much lower. They don't expect perfection every time.
When a normal human asks a programmer to make a change, they understand that communication is lossy, and even if it isn't, programmers make mistakes.
Using an LLM is like using any other tool. Or like asking any other human to do something.
For programmers, it's a cardinal sin if the tool is unpredictable. So they dismiss it. For everyone else, it's just another tool. They embrace it.
[1] This, of course, is changing as they become better at coding.
It's strong enough to replace humans at their jobs and weak enough that it can't do basic things. It's a paradox. Just learn to be productive with them. Pay $200/month and work around its little quirks. /s
It seems like just such a weird and rigid way to evaluate it? I am a somewhat reasonable human coder, but I can't copy and paste a bunch of code without alterations from memory either. Can someone still find a use for me?