Except that if you spend quality time with coding agents, you realize that's not actually true.
They're equally useful for novel tasks, because they don't work by copying large-scale patterns from their training data - the recent models can break virtually any programming task down into a bunch of functions and components and cobble together working code.
If you can clearly define the task, they can work towards a solution with you.
The main benefit of concepts already in the training data is that it lets you slack off on clearly defining the task. At that point it's not the model "cheating", it's you.
> Except if you spend quality time with coding agents you realize that's not actually true.
Agent engineering seems (from the outside!) to be converging on the quality of lived experience. Compared to Stone Age manual coding, it's less about technical arguments and more about intuition.
Vibes, in short.
You can’t explain sex to someone who has not had sex.
Any interaction with tools is partly about intuition. It's a difference of degree, not kind.
Simon, do you happen to have some concrete examples of a model doing a great job at a clearly novel, clearly non-trivial coding task?
I'd find it very interesting to see some compelling examples along those lines.