Chain-of-code is better than chain-of-thought because it's more grounded, more specific, and achieves a lot of useful compression. But my bet is that the proposed program-of-thought is too specific. Moving all the way from "very fuzzy specification" to "very concrete code" skips all of the space in the middle, and now there's no room to iterate without a) burning lots of tokens and b) getting bogged down in finding and fixing whatever new errors are introduced in the translated representations. IOW, when there's an error, will it be in the code itself or in the scenario that code was supposed to be representing?
I think the intuition that lots of people jumped to early, that "specs are the new code", was always correct, but at the same time it was absolutely nuts to think that specs can be represented well with natural language and bullet lists in markdown. We need chain-of-spec that leverages something semi-formal and then iterates on that representation, probably with feedback from other layers. Natural language provides constraints, and guess-and-check code generation operates at the implementation level, but neither is actually the specification, which is the heart of the issue. A perfect intermediate language will probably end up being something pretty familiar that leverages and/or combines existing formal methods: model checkers, logic, games, discrete simulations, graphs, UML, etc. Why? It's just very hard to beat this stuff for compression, and it's what all the "context compaction" things are really groping towards anyway. See also the wisdom about "programming is theory building" and so on.
I think if/when something like that starts getting really useful, you probably won't hear much about it, and there won't be a lot of talk about the success of hybrid systems combining LLMs and symbolics. Industry giants would have a huge vested interest in keeping the useful intermediate representations/languages as secret sauce. Why? Well, they can keep pretending they're doing something semi-magical with scale and sufficiently deep chain-of-thought, and bill for the extra tokens. That would tend to preserve the appearance of a big-data and big-compute moat for training and inference even as it gradually dries up.
Perhaps something like TLA+ or PlusCal specs could be the specs in terms of 'specs are the new code'.
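For concreteness, here's a minimal TLA+ sketch of what a spec-as-artifact could look like. The module, variable names, and invariant are a toy counter example made up for illustration, not drawn from any real project:

```tla
---- MODULE Counter ----
EXTENDS Naturals
VARIABLE count

\* Initial state: the counter starts at zero.
Init == count = 0

\* The only action: increment the counter by one.
Incr == count' = count + 1
Next == Incr

\* An invariant a model checker (TLC) can verify in every reachable state.
Inv == count \in Nat

\* The full temporal specification: Init, then Next steps (or stuttering).
Spec == Init /\ [][Next]_count
====
```

The point is that even a spec this small is mechanically checkable, which is the property natural-language bullet lists lack.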
Delusional vibe coding bullshit. Find me one significant software project based on using natural language for the software.
I posted the RSpec version of this earlier this week in the thread on Obie's keynote about Ruby + TDD + AI.
I've been working on a project to turn markdown into a computational substrate, sort of Skills+. It embeds the computation in the act of reading the file so you don't have to teach anything (or anyone) how to do anything other than read the data on the page along with the instructions on what to do with it. It seemed the simplest way of interacting with a bunch of machines that really love to read and write text.
I use a combination of reference manuals and user guides to replace the specs as a description of intent for the input to the process. They need to be written and accurate anyway, and if they're the input to the process, how can they not be? After all:
requirements = specs = user stories = tests = code = manuals = reference guides = runbooks
They're all different projections of the same intent and they tend to be rather difficult to keep in sync, so why not compress them?
https://tech.lgbt/@graeme/115642072183519873
This lets one artifact play all of the roles in the process. For anything non-trivial, you can use compositional vectors like `${interpolation:for-data.values}` or `{{include:other:sections-as.whole-units}}`, or run special `graphnode:my-funky-cold-medina:fetch` fences that execute code found on other nodes and feed the output through middleware to transform and transpile it into the parent document.
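A rough sketch of what resolving those directives could look like. The directive syntax is taken from the comment above; the `DATA` and `SECTIONS` stores and the `resolve` function are hypothetical stand-ins for whatever backing graph the real system uses:

```python
import re

# Hypothetical backing stores; in the real system these would be
# looked up from other nodes in the document graph.
DATA = {"for-data.values": "42"}
SECTIONS = {"other:sections-as.whole-units": "## Included section\n..."}

def resolve(text: str) -> str:
    """Expand interpolation and include directives in a markdown string."""
    # ${interpolation:key} -> substitute a scalar value
    text = re.sub(r"\$\{interpolation:([^}]+)\}",
                  lambda m: DATA[m.group(1)], text)
    # {{include:key}} -> splice another section in as a whole unit
    text = re.sub(r"\{\{include:([^}]+)\}\}",
                  lambda m: SECTIONS[m.group(1)], text)
    return text

doc = "Value is ${interpolation:for-data.values}.\n{{include:other:sections-as.whole-units}}"
print(resolve(doc))
```

A full implementation would also need the `graphnode:` fence executor and the middleware chain, which this sketch omits.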
Think of it like Rack for your thoughts.
I threw the AST thing on it because I'd been playing with that node and thought a symbol table would be useful for two reasons: it's hard to hallucinate a symbol table that's created deterministically, and once I've got it, it saves scanning entire files when I'm just looking for a method.
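In Python, the deterministic symbol table is a few lines over the standard-library `ast` module. This is a generic illustration of the idea, not the actual node from the project:

```python
import ast

def symbol_table(source: str) -> dict:
    """Deterministically map top-level function/class names to line numbers."""
    table = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            table[node.name] = node.lineno
    return table

src = "class Greeter:\n    def hello(self):\n        pass\n"
print(symbol_table(src))  # {'Greeter': 1, 'hello': 2}
```

Because the table comes from the parser rather than the model, a lookup either hits a real definition or fails loudly; there's nothing to hallucinate.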
New computing paradigms sometimes require new tools.
I think you're absolutely right about the rest of it. LLM assisted development is the process of narrowing the solution space by contextual constraints and then verifying the outcome matches the intent. We need to develop tools on both ends of that spectrum in order to take full advantage of them.
Try /really/ telling one what you want next time, instead of how to do it. See if your results are any better.