Hacker News

Show HN: An LLM-optimized programming language

42 points by ImJasonH | today at 3:11 AM | 31 comments

Comments

kburman | today at 6:59 AM

An LLM is optimized for its training data, not for newly built formats or abstractions. I don’t understand why we keep building so-called "LLM-optimized" X or Y. It’s the same story we’ve seen before with TOON.

cpeterso | today at 7:02 AM

If a new programming language doesn’t need to be written by humans (though should ideally still be readable for auditing), I hope people research languages that support formal methods and model checking tools. Formal methods have a reputation for being too hard or not scaling, but now we have LLMs that can write that code.

https://martin.kleppmann.com/2025/12/08/ai-formal-verificati...

atlintots | today at 2:20 PM

I'm assuming OP is not aware of APL, J, or similar array programming languages.

giancarlostoro | today at 11:26 AM

The real question is which existing language is already perfect for LLMs. Is it Lisp? ASM? We know some LLMs are better at some languages than others, but which existing language are they best at? It would be interesting to see. One spot I know they all fail at is niche programming libraries: they have to pull down docs or review the raw source of the dependency, and the issue is that in some languages, like C# and Java, those dependencies are precompiled to bytecode.

discrisknbisque | today at 4:21 AM

The Validation Locality piece is very interesting and really got my brain going. Would be cool to denote test conditions in line with definitions. Would get gross for a human, but could work for an LLM with consistent delimiters. Something like (pseudo code):

```
fn foo(name::"Bob"|genName(2)):
    if len(name) < 3:
        Err("Name too short!")
    print("Hello ", name)
    return::"Hello Bob"|Err
```

Right off the bat I don't like that it relies on accurately remembering list indexes to keep track of tests (something you brought up), but it was fun to think about this and I'll continue to do so. To avoid the counting issue you could provide tools like "runTest(number)", "getTotalTests", etc.
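For what it's worth, Python's doctest module already gets close to this idea of tests living right next to the definition. A minimal sketch (the function name, inputs, and error message are just illustrative):

```
def foo(name):
    """Greet a user by name.

    >>> foo("Bob")
    'Hello Bob'
    >>> foo("Al")
    Traceback (most recent call last):
        ...
    ValueError: Name too short!
    """
    if len(name) < 3:
        raise ValueError("Name too short!")
    return "Hello " + name

if __name__ == "__main__":
    # Runs every example embedded in the docstrings of this module.
    import doctest
    doctest.testmod()
```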

One issue: The Loom spec link is broken.

AlexCoventry | today at 8:22 AM

I'm looking for a language optimized for use with coding agents. Something which helps me to make a precise specification, and helps the agent meet all the specified requirements.

forgotpwd16 | today at 8:10 AM

There was one other just yesterday: https://news.ycombinator.com/item?id=46571166

Mathnerd314 | today at 4:32 AM

I get that this is essentially vibe coding a language, but it still seems lazy to me. He just asked the language model, zero-shot, to design a language with no further guidance. You could at least use the Rosetta Code examples and ask it to identify design patterns for a new language.

evacchi | today at 8:00 AM

Weeks ago I was also noodling around with the idea of programming languages for LLMs, but as a means to co-design DSLs: https://blog.evacchi.dev/posts/2025/11/09/the-return-of-lang...

Surac | today at 8:42 AM

So where are the millions of lines of code you need to train the LLM on your new language? Remember, AI is just a statistical prediction thing. No input -> no output.

middayc | today at 10:10 AM

He has good points about languages.

But it reminds me of the SEO guys optimizing for search engines. At the end of the day, the real long term strategy is to just "make good content", or in this case, "make a good language".

In the futuristic :) long term, in a "post programming-language world", I predict each big LLM provider will have its own proprietary compiler/VM/runtime. Why bother transpiling if you can own the experience and the result 100% and compete on that with other LLM providers?

mooktakim | today at 11:55 AM

Just get the LLM to write assembly.

petesergeant | today at 4:16 AM

A language is LLM-optimized if there's a huge amount of high-quality prior art, and if the language tooling itself can help the LLM iterate and catch errors.

internet_points | today at 8:04 AM

LLM-optimized in reality would mean you asked and answered millions of Stack Overflow questions about it and then waited a year or so for all the major models to retrain.

mike_hearn | today at 10:01 AM

I've thought about this too.

The primary constraint is the size of the language specification. Any new programming language starts out not being in the training data, so in-context learning is all you've got. That makes it similar to a compression competition: the size of the codec is considered part of the output size in such contests, so you have to balance codec code against how effective it is. You can't win by making a gigantic compressor that produces a tiny output.

To me that suggests starting from a base of an existing language and using an iterative tree-based agent exploration. It's a super expensive technique and I'm not sure the ROI is worth it, but that's how you'd do it. You don't want to create a new language from scratch.

I don't think focusing on tokenization makes sense. The more you drift from the tokenization of the training text, the harder it will be for the model to work, just like with a human (and that's what the author finds). At best you might get small savings by asking it to write in something like Chinese, but the GPT-4/5 token vocabularies already have a lot of programming-related tokens like ".self", ".Iter", "-server" and so on. So trying to make something look shorter to a human can easily be counterproductive.
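If you want to sanity-check how a snippet actually splits into tokens, OpenAI's tiktoken library makes this easy to inspect. A quick sketch (the encoding name and snippet are just illustrative):

```
# pip install tiktoken
import tiktoken

# o200k_base is the encoding used by the GPT-4o family of models.
enc = tiktoken.get_encoding("o200k_base")

snippet = "for (const item of items) { server.send(item); }"
tokens = enc.encode(snippet)

print(f"{len(tokens)} tokens")
for t in tokens:
    # Print each token's text so you can see where the boundaries fall.
    print(repr(enc.decode([t])))
```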

A better approach is to look at where models struggle and try to optimize a pre-existing language for those issues. It might all be rendered obsolete by a better model released tomorrow, of course, but what I see are problems like these:

1. Models often want to emit imports or fully qualified names into the middle of code, because they can't go backwards and edit what they already emitted to add an import line at the top. So a better language for an LLM would be one that doesn't require you to move the cursor upwards as you type. Python/JS benefit here because you can run an import statement anywhere; languages like Java or Kotlin are just about workable because you can write out names in full and importing something is just a convenience; but languages that force you to import types only at the very top of the file are going to be hell for an LLM.
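To illustrate the Python case, the model can emit the import at the exact moment it realizes it needs the dependency, with no backtracking (a minimal sketch; the function is just an example):

```
def parse_config(path):
    # The import can appear right where the dependency is first needed,
    # so the model never has to move "up" and edit the top of the file.
    import json
    with open(path) as f:
        return json.load(f)
```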

Taking this principle further, it may be useful to have a PL that lets you emit "delete last block" type tokens (a smarter ^H). If the model emits code that it then realizes was wrong, it no longer has to commit to it and build on it anyway; it can wipe it and redo it. I've often noticed GPT-5 use "no op" patterns when it emits patches, where it deletes a line and then immediately re-adds the exact same line, and I think it's because it changed what it wanted to do halfway through emitting a patch but had no way to stop except by doing a no-op.

The nice thing about this idea is that it's robust to model changes. For as long as we use auto-regression this will be a problem. Maybe diffusion LLMs find it easier but we don't use those today.

2. As the article notes, models can struggle with counting indentation, especially when emitting patches. That suggests NOT using a whitespace-sensitive language like Python. I keep hearing that Python is the "language of AI", but objectively models do sometimes still make mistakes with indentation. In a brace-based language this isn't a problem: you can just mechanically reformat any file that the LLM edits after it's done. In a whitespace-sensitive language that's not an option.

3. Heavy use of optional type inference. Types communicate lots of context in a small number of tokens, but demanding the model actually write out types is also inefficient (it knows in its activations what the types are meant to be). So what you want is to encourage the model to rely heavily on type inference even if the surrounding code is explicit, then use a CLI tool that automatically adds in missing type annotations, i.e. you enrich the input and shrink the output. TypeScript, Kotlin etc. are all good for this. Languages like Clojure, I think, are not so good, despite being apparently token-efficient on the surface.
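A rough Python analogue of that flow (the comment's examples are TypeScript and Kotlin; this is only an illustration): the model skips the return annotation, a type checker infers it, and a separate tool can write it back for human readers.

```
def load_scores(path: str):
    # No return annotation emitted: a checker such as mypy or pyright can
    # infer dict[str, float] from the body, so the model spends no output
    # tokens spelling it out. A tool like MonkeyType could later write the
    # annotation back into the source for human readers.
    with open(path) as f:
        pairs = (line.strip().split(",") for line in f)
        return {name: float(value) for name, value in pairs}
```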

4. In the same way you want to let the model import code halfway through a file, it'd be good to also be able to add dependencies halfway through a file, without needing to manually edit a separate file somewhere else. Even if it's redundant, you should be able to write something like "import('@foo/bar:1.2.3').SomeType.someMethod". Languages like JS/TS are the closest to this. You can't do it in most languages, where the definition of a package+version is very far, both textually and semantically, from the place where it's used.

5. Agree with the author that letting test and production code be interleaved sounds helpful. Models often forget to write tests but are good at following the style of what they see. If they see test code intermixed with the code they're reading and writing they're more likely to remember to add tests.
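A small Python sketch of what that interleaving can look like, with test functions sitting directly under the code they exercise (names are illustrative; a runner like pytest should collect them when pointed directly at the file):

```
import re

def slugify(title):
    """Lowercase the title and join its words with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)

# Tests live right next to the implementation, so a model editing slugify
# sees them and is nudged to keep them up to date.
def test_slugify_strips_punctuation():
    assert slugify("Hello, World!") == "hello-world"

def test_slugify_collapses_whitespace():
    assert slugify("  LLM   Optimized  ") == "llm-optimized"
```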

There are probably dozens of ideas like these. The nice thing is, if you implement it as a pre-processor on top of some other language, you exploit the existing training data as much as possible, and in fact the codebase it's working on becomes 'training data' as well, just via ICL.

sublinear | today at 8:55 AM

I think I've come full circle back to the idea that a human should write the high-level code unassisted, and the LLM should autocomplete the glue code and draft the function implementations. The important part is that the human maintains these narrow boundaries and success criteria within them. The better the scaffolding, the better the result.

Nothing else really seems to make sense or work all that well.

On the one extreme you have people wanting the AI to write all the code on vibes. On the other extreme you have people who want agents that hide all low-level details behind plain English except the tool calls. To me these are basically the same crappy result, where we hide the code the wrong way.

I feel like what we really need is templating instead of vibes or agent frameworks. Put another way, I just want the code folding in my editor to magically write the code for me when I unfold. I just want to distribute that template and let the user run it in a sandbox. If we're going to hide code from the user at least it's not a crazy mess behind the scenes and the user can judge what it actually does when the template is written in a "literate code" style.

rvz | today at 4:53 AM

> Humans don't have to read or write or understand it. The goal is to let an LLM express its intent as token-efficiently as possible.

Maybe in the future, humans won't have to verify the spelling, logic, or ground truth of programs either, because we'll all have to give up and assume that the LLM knows everything. /s

Sometimes, when I read these blogs from vibe-coders who have become completely complacent with LLM slop, I have to keep reminding others why regulations exist.

Imagine if LLMs became fully autonomous pilots on commercial planes, or planes were optimized for AI control, and the humans just boarded the plane and flew for the vibes. Maybe call it "Vibe Airlines".

Why didn't anyone think of that great idea? Why not completely remove the human from the loop as well?

Good idea isn't it?
