At this point, I am starting to feel like we don’t need new languages, but new ways to create specifications.
I have a hypothesis that an LLM can act as a pseudocode-to-code translator, where the pseudocode can tolerate a mixture of code-like and natural-language specification. The benefit is that it formalizes the human as the specifier (which must be done anyway) and the LLM as the code writer. This might also make lower-resource "non-frontier" models more useful. Additionally, it tolerates syntax mistakes or, in the worst case, plain natural language if needed.
In other words, I think LLMs don't need new languages; we do.
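To make that concrete, here is a rough sketch of the kind of translation I have in mind; the task, file name, and field names are invented purely for illustration:

```python
# Input: a mixed pseudocode / natural-language spec (hypothetical example)
#
#   load orders from orders.csv
#   keep only orders where total > 100
#   group by customer, sum totals
#   print the top 5 customers by spend
#
# Output: the kind of code an LLM translator might produce from it.

import csv
from collections import defaultdict

def top_customers(path: str = "orders.csv", min_total: float = 100.0, n: int = 5) -> None:
    spend = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            total = float(row["total"])
            if total > min_total:                 # "keep only orders where total > 100"
                spend[row["customer"]] += total   # "group by customer, sum totals"
    # "print the top 5 customers by spend"
    for customer, total in sorted(spend.items(), key=lambda kv: kv[1], reverse=True)[:n]:
        print(customer, round(total, 2))

if __name__ == "__main__":
    top_customers()
```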
I think this confuses two different things:
- LLMs can act as pseudocode to code translators (they are excellent at this)
- LLMs still create bugs and make errors, and a reasonable hypothesis is that they do so at a rate directly proportional to the "complexity" or "buggedness" of the underlying language.
In other words, give an AI a footgun and it will happily use it, unawares. That doesn't mean, however, that it can't rapidly turn your pseudocode into code.
None of this means that LLMs can magically correct your pseudocode at all times if your logic is vastly wrong for your goal, but I do believe they'll benefit immensely from new languages that reduce the kind of bugs they make.
This is the moment we can create these languages. LLMs can optimize for things that humans can't, so it seems possible to design new languages that reduce bugs in ways that work for LLMs but are less effective for people (due to syntax, ergonomics, verbosity, or anything else).
This is crucially important. Why? Because 99% of all code written in the next two decades will be written by AI, and we will also produce 100x more code than has ever been written before (because the cost of doing so has dropped essentially to zero). This means that, short of some revolution in language technology, the number of bugs and vulnerabilities we can expect will also grow 100x.
That's why ideas like this are needed.
I believe in this too and am working on something that also targets LLMs specifically; I've been at it since mid-to-late November last year. A business model will make such a language sustainable.
I disagree; I think we will always need new languages. Every language becomes more and more unnecessarily complex over time.
It's just part of the software lifecycle. People think their job is to "write code", and so everything accumulates more features, more abstractions, more complexity, more "five different ways to do one thing".
There are many, many examples: C++, Java (especially circa 2000-2010), and on and on. There's no hope for older languages. We need simpler languages.
Ah, people are starting to see the light.
This is something that could be distilled from industries like aviation, where the specification of the software (requirements, architecture documents, etc.) is even more important than the software itself.
The problem is that natural language is inherently ambiguous, and people don't really grasp the importance of clear specification (I've lost count of how many times I've had to repeat that any limit stated in a requirement needs units and tolerances).
Another problem: natural language doesn't have "defaults". If you don't specify something, it is open to interpretation, and people _will_ interpret it instead of saying "yep, I don't know this".
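As a rough sketch of what "no silent defaults" could look like in a machine-checkable form (all names, units, and values here are invented for illustration), every limit carries a unit and a tolerance, and anything unspecified has to be marked as such rather than left to interpretation:

```python
from dataclasses import dataclass
from enum import Enum

class Unit(Enum):
    CELSIUS = "degC"
    MILLISECOND = "ms"

@dataclass(frozen=True)
class Limit:
    value: float
    unit: Unit
    tolerance: float          # same unit as value; must be stated explicitly

@dataclass(frozen=True)
class Requirement:
    req_id: str
    text: str
    limit: Limit | None       # None means "explicitly unspecified", not "guess something"

reqs = [
    Requirement("REQ-017", "Max operating temperature", Limit(85.0, Unit.CELSIUS, 2.0)),
    Requirement("REQ-018", "Startup time", Limit(250.0, Unit.MILLISECOND, 50.0)),
    Requirement("REQ-019", "Supply voltage range", None),   # flagged, not silently interpreted
]

for r in reqs:
    status = "UNSPECIFIED" if r.limit is None else f"{r.limit.value} {r.limit.unit.value} ±{r.limit.tolerance}"
    print(r.req_id, r.text, "->", status)
```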
We're already at a point where I think a PR should start with an LLM prompt that fully specs out the change/feature.
And then we can look at multiple LLM-generated implementations to inform how the prompt might need to be updated further until it's a one-shot.
Now the intent behind the code is captured explicitly, and you can refine the intent if it's wrong.
I’ve been on a similar train of thought. Just last weekend I built a little experiment, using LLMs to highlight pseudocode syntax:
We already have some markup for architecture, like d2lang, sequencediagram.org's syntax, and bpmn.io's XMLs (which are OMG XMLs), so the question is: can we master these and not invent new stuff for a while?
P.S. A combination of the above fares very well during my agentic coding adventures.
This is the approach that Agint takes. We infer the structure of the code first, top down, as a graph, then add types, then interpret the types as input/output function signatures, and then "inpaint" the function bodies for codegen.
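Very roughly, the pipeline stages look something like this (this is not Agint's actual code; names and shapes are invented just to illustrate the "structure first, types second, bodies last" idea):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    calls: list[str] = field(default_factory=list)   # stage 1: call graph, no types yet
    signature: str | None = None                      # stages 2-3: typed in/out signature
    body: str | None = None                           # stage 4: "inpainted" implementation

# Stage 1: infer the structure of the program, top down, as a graph.
graph = {
    "main": Node("main", calls=["load_config", "run"]),
    "load_config": Node("load_config"),
    "run": Node("run", calls=["load_config"]),
}

# Stages 2-3: attach types and read them as input/output function signatures.
graph["load_config"].signature = "load_config(path: str) -> dict"
graph["run"].signature = "run(config: dict) -> int"
graph["main"].signature = "main() -> int"

# Stage 4: each body is generated ("inpainted") against its fixed signature,
# e.g. by prompting a model with the signature plus the signatures of its callees.
def inpaint(node: Node) -> str:
    return f"def {node.signature.split(' -> ')[0]}:\n    ...  # model-generated body\n"

for node in graph.values():
    node.body = inpaint(node)
    print(node.body)
```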
I'm actually building this, will release it early next month. I've added a URL to watch to my profile (should be up later this week). It will be Open Source.
And so it comes full circle XD.
>>new ways to create specifications.
That's again a programming language. The real issue with LLMs right now is that it doesn't matter if they can generate code quickly; someone still has to read, verify, and test it.
Perhaps we need a terse programming language that can be read and verified quickly. You could call that a specification.
I think that we haven't even started to properly think about a higher-level spec language. The highest-level objects would have to be the users and the value that they get out of running the software. Even specific features would have to be subservient to that, and would shift as the users' requirements shift. Requirements written in a truly higher-level spec language would allow the software to change features without the spec itself changing.
This is where LLMs slip up. I need a higher-level spec language where I don't have to specify to an LLM that I want the jpeg crop to be lossless if possible. It's doubly obvious that I wouldn't want it to be lossy, especially because making it lossy likely makes the resulting files larger. This is not obvious to an LLM, but it's absolutely obvious if our objects are users and user value.
A truly higher-level spec language compiler would recognize when actual functionality disappeared when a feature was removed, and would weigh the value of that functionality within the value framework of the hypothetical user. It would be able to recognize the value of redundant functionality by putting a value on user accessibility - how many ways can the user reach that functionality? How does it advertise itself?
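To make the shape of that concrete, here's a hypothetical sketch (every name and weight is invented; no such spec language exists yet) of what "users and user value as the top-level objects" might look like, using the crop example above:

```python
from dataclasses import dataclass

@dataclass
class UserValue:
    description: str          # what the user actually cares about
    weight: float             # how much it matters to them

@dataclass
class Feature:
    name: str
    serves: list[UserValue]   # a feature exists only to deliver user value

crop_quality = UserValue("crop images without degrading them or growing the file", 0.9)
crop_speed   = UserValue("crop images quickly", 0.4)

features = [Feature("jpeg crop", serves=[crop_quality, crop_speed])]

# A spec "compiler" in this style could flag a change that removes or weakens a
# feature serving high-weight value, e.g. switching the crop path to a lossy re-encode.
def value_at_risk(removed: Feature) -> float:
    return sum(v.weight for v in removed.serves)

print(value_at_risk(features[0]))   # ~1.3
```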
We still haven't even thought about it properly. It's that "software engineering" thing that we were in a continual argument about whether it existed or not.
So in this case an LLM would just be a less reliable compiler? What's the point? If you have to formally specify your program, we already have tools for that; no boiling the ocean required.
Coding in LaTeX and then translating to the target language via an LLM works remarkably well nowadays.
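For example (my own illustrative snippet, using the standard algorithm/algpseudocode packages; the algorithm itself is arbitrary), pseudocode like this gives the model an unambiguous structure to translate:

```latex
\documentclass{article}
\usepackage{algorithm}
\usepackage{algpseudocode}
\begin{document}
\begin{algorithm}
\caption{Moving average over a stream}
\begin{algorithmic}[1]
\Require window size $k > 0$
\State $B \gets$ empty queue, $s \gets 0$
\While{a new sample $x$ arrives}
    \State push $x$ onto $B$; $s \gets s + x$
    \If{$|B| > k$}
        \State $s \gets s - \mathrm{pop}(B)$
    \EndIf
    \State \textbf{emit} $s / |B|$
\EndWhile
\end{algorithmic}
\end{algorithm}
\end{document}
```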
LLMs work great in a closed loop where they can self-correct, but we don't have a reliable way to lint and test specs. We need a new language for that.
> The benefit being that it formalizes the human as the specifier (which must be done anyway) and the llm as the code writer.
The code was always a secondary effect of making software. The pain is in fully specifying behavior.
Go read ai2027 and then be ashamed of yourself /s
But seriously, LLMs can transmit ideas to each other through English that we do understand; we are screwed if it's another language lol
What we need is a programming language that defines the diff to be applied to the existing codebase with the same degree of unambiguity as the codebase itself.
That is, in the same way that event sourcing materializes a state from a series of change events, this language needs to materialize a codebase from a series of "modification instructions". Different models may materialize a different codebase from the same series of instructions (like different compilers would), or under different "environmental factors" (e.g. the database or cloud provider that's available). It's as if the codebase itself is no longer the important artifact; the sequence of prompts is. You would also use this sequence of prompts to generate a testing suite completely independent of the codebase.
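A minimal sketch of the event-sourcing analogy (all names are hypothetical, and `model` stands in for whatever actually does the generation):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Instruction:
    intent: str                 # the prompt-level description of the change
    env: dict | None = None     # environmental factors (database, cloud provider, ...)

def materialize(instructions: list[Instruction], model) -> dict[str, str]:
    """Fold instructions into a codebase: {path: file contents}."""
    codebase: dict[str, str] = {}
    for ins in instructions:
        # Different models (like different compilers) may emit different code for
        # the same instruction; the instruction stream is the durable artifact.
        codebase = model.apply(codebase, ins)   # hypothetical generation step
    return codebase

history = [
    Instruction("create a REST service exposing /health"),
    Instruction("add a /users endpoint backed by the available database", env={"db": "postgres"}),
    Instruction("add request logging"),
]

# codebase = materialize(history, model)        # one possible materialization
# tests    = materialize(history, test_model)   # an independent test suite from the same history
```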