Which makes for an interesting thought / discussion; code is written to be read by humans first, executed by computers second. What would code look like if it was written to be read by LLMs? The way they work now (or, how they're trained) is on human language and code, but there might be a style that's better for LLMs. Whatever metric of "better" you may use.
Just a thought experiment, I very much doubt I'm the first one to think of it. It's probably in the same line of "why doesn't an LLM just write assembly directly"
> It's probably in the same line of "why doesn't an LLM just write assembly directly"
My suspicion is that the "language" part of LLMs means they tend to prefer languages which are closer to human languages than assembly and benefit from much of the same abstractions and tooling (hence the recent acquisition of bun and astral).
The problem with that is that assembly isn't portable, and x86 isn't as dominant as it once was, so then you've got arm and x86(_64). But you could target the LLVM machine if you wanted.
LLMs read and write human-code because humans have been reading and writing human-code. The sample size of assembly problems is, in my estimate, too small for LLMs to efficiently read and write it for common use cases.
I liken it to the problem of applying machine learning to hard video games (e.g. Starcraft). When trained to mimic human strategies, it can be extremely effective, but machine learning will not discover broadly effective strategies on a reasonable timescale.
If you convert "human strategies" to "human theory, programming languages, and design patterns", perhaps the point will be clear.
But: could the ouroboric cycle of LLM use decay the common strategies and design patterns we use into inexplicable blobs of assembly? Can LLMs improve at programming if humans do not advance the theory or invent new languages, patterns, etc?