Read the first few comments and was surprised I didn't see it mentioned: training data. The voluminous amount of Python in the training data.
I could write in Brainfuck with AI, but I presume I wouldn't get the same results as I would with Python.
My follow-up question: with AI now, why care about a language until you need to?
I had an itch to give Perl another go after a 5-year hiatus. I wanted a super simple way to spawn a proxy I was building in Go, along with writing various integration tests. I used Claude Code to write the bulk of it and found Claude to be remarkably good at Perl. I told Claude to only use what's built into Perl's standard library rather than reaching for anything on CPAN. Turns out everything from HTTP clients to TLS and JSON is built in, which makes it a very stable and easy replacement for what I would normally have implemented in shell scripts. My theory is that because Perl hasn't changed all that much and has a ton of training data, Claude is actually quite good at it for the cases where you might otherwise write shell scripts.
With AI it's important to catch errors/hallucinations early, and static typing helps with that.
Languages with dynamic typing might hide some errors until runtime, while statically typed ones can catch them at compile time.
With dynamic languages you need way more tests to cover the scenarios that the compiler handles for you in static ones.
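A minimal sketch of what that buys you (my example, not the commenter's): with type hints, a checker like mypy flags the bug before the code ever runs, while plain dynamic Python only fails once it executes.

    def total_cents(prices: list[float]) -> int:
        # Convert a list of prices in dollars to a rounded total in cents.
        return round(sum(prices) * 100)

    # mypy: error: Argument 1 to "total_cents" has incompatible type
    # "list[str]"; expected "list[float]"
    total_cents(["1.99", "2.49"])  # without the check, this crashes inside sum() at runtime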
And there is a significant amount of code that has been written "for ages" in languages that have been around longer, like C, C++, and Java (yes, I know Python is quite old, older than Java; it dates to 1991).
Just use Go. LLMs have seen a ton of it, they write it well, it compiles practically instantly, and it has all the advantages of a typed compiled language.
I created a big Python codebase using AI, and the LLM constantly guesses arguments or dictionary formats wrong. Unit tests and stuff like pydantic help, but it's better to avoid that whole class of runtime errors altogether.
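For illustration (my sketch, with made-up field names, not the commenter's codebase): pydantic turns a wrongly guessed dict shape into one immediate, readable error instead of a KeyError or TypeError deep in the run.

    from pydantic import BaseModel, ValidationError

    class Job(BaseModel):
        name: str
        retries: int
        timeout_s: float

    try:
        # The kind of payload an LLM might guess: misspelled key, wrong type.
        Job(**{"name": "sync", "retrys": 3, "timeout_s": "fast"})
    except ValidationError as e:
        print(e)  # reports the missing "retries" field and the unparsable float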
Training data can't be the whole answer. LLMs are really good at translating to different programming languages. This makes sense, given that they are derived from text translation systems. I'm getting great results in languages with comparatively small bodies of freely available code. The bigger hurdle is usually that LLMs tend to copy common idioms in the target language and if it is an "enterprise-y" language like Java or C#, the amount of useless boilerplate can skyrocket immediately, which creates a real danger that the result grows beyond the usable context window size and the quality suffers.
For some people, reducing infra costs matters. Python is very, very slow, even if it uses native libs.
That would matter if we were asking the AI to generate code open-loop: someone probably already wrote something close to what you asked for in Python. But if the agent generates code, tries to compile it, sees the detailed error messages and acts on those messages to refine the code, it's going to produce a higher quality result. rustc produces really good diagnostics. And there's a lot of Rust code online now, even if there's so much more Python and Javascript/Typescript.
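That closed loop is simple to sketch in Python (ask_llm is a hypothetical placeholder for whatever model API you use; the rustc invocation and its stderr diagnostics are real):

    import pathlib
    import subprocess
    import tempfile

    def ask_llm(prompt: str) -> str:
        raise NotImplementedError  # hypothetical: call your model of choice here

    def refine(task: str, max_turns: int = 5) -> str:
        prompt = task
        for _ in range(max_turns):
            code = ask_llm(prompt)
            src = pathlib.Path(tempfile.mkdtemp()) / "main.rs"
            src.write_text(code)
            # rustc prints its (unusually good) diagnostics to stderr;
            # feed them back to the model verbatim.
            result = subprocess.run(
                ["rustc", str(src), "-o", str(src.with_suffix(""))],
                capture_output=True, text=True,
            )
            if result.returncode == 0:
                return code
            prompt = task + "\n\nYour last attempt failed to compile:\n" + result.stderr
        raise RuntimeError("no compiling solution within the turn budget")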
I built a programming language, and LLMs can code phenomenally well in it.
I don't think the training set matters that much, since there's no way they have my language in their training set!
Programming languages have a lot in common. Python is actually kind of an odd one out, as languages go.
People really need to stop assuming that more training data is better. That's not how it works. LLMs thrive on consistency.
Go, for example, has significantly less training data than Python, but LLMs are at their best with it. Why? Go is usually written the same way. You go from project to project and the code all looks the same. There are only a very few ways to write Go.
Also, every single interpreter error has an entire corpus of StackOverflow-esque fix suggestions alongside it, and the model has been fine-tuned to minimize such errors on the first try. This hasn't been done for more obscure languages. You'll likely take more turns, on average, to get a working output, even if your problem is fully verifiable via test input/outputs - and if it's not verifiable, you don't want the "attention" of the model focused on syntax rather than the solution.
A large volume of training data is a blessing and a curse, especially when you consider who wrote it.
> I could write in Brainfuck with AI, but I presume I wouldn't get the same results as I would with Python.
Admittedly, I have very little experience with LLM-assisted Python. However, based on the severe degradation in output quality I have seen from an LLM working with plain JavaScript as opposed to TypeScript, I can't imagine choosing to start a project in Python at the moment.
I wrote about the meta thesis of programming languages in the training data here
Seems to me these LLMs have a critical mass of Python training data and Rust training data, so there's no advantage for Python there.
So as the article points out, an iterative process that catches the mistakes at compile time is much more suited for an AI than one that catches them at runtime.
LLMs are actually worse at generating Python than other langs, hypothesized to be due to the quality of the training data, lol.
I still read the generated code, so I'm not quite willing to give up on Python yet though.
I moved from writing all my code with LLMs in Python to writing it in Rust. I've seen absolutely no difference; most of the time I couldn't even tell you which one it's writing in.
My programs are faster and more reliable than they’ve ever been.
1) The models do generalise, so concepts translate. 2) Languages with more opinionated semantics and a better, more coherent community seem to work better. Python is a broad shitshow with multiple ways to achieve the same thing. Elixir is tight and focused. Claude is much better at Elixir.
I wouldn't say I get worse results with Go than I do with Python.
That's right, we don't need to care about a language, the same way we don't need to care about the map when FSD promises it's already end-to-end optimal.
There's enough training data on the other langs.
> Read the first few comments and was surprised I didn't see it mentioned: training data. The voluminous amount of Python in the training data.
That's actually part of the point. Almost no one writes types for Python and has complete type compliance. So all that training data is people just yoloing Python, writing a bunch of poor code in it.
I honestly can't believe any experienced software engineer would decide to build systems in Python these days.
No. If that mattered, you'd write everything in HTML and CSS, because those have way more training data.
"I could write in brainfuck with ai"
Well, go on and do the experiment! Perhaps LLMs can write code as well in BF as in Python, but I don't recommend it, because hallucinations are really hard to notice in BF.
If you are going to worry about high-level computer languages and AI, you are going to have to start by getting to grips with machine code and assemblers and all that. Once you know how, say, some Python code ends up being processed by your laptop's CPU(s), then you will know when BF might be best!
Surprisingly, LLMs are actually much worse at reasoning in Python than in other common programming languages for agentic coding tasks.
Data here: https://gertlabs.com/rankings?mode=agentic_coding