It's interesting that Claude is able to effectively write Elixir, even if it isn't super idiomatic without established styles in the codebase, considering Elixir is a pretty niche and relatively recent language.
What I'd really like to see, though, is experiments on whether you can few-shot prompt an AI to in-context-learn a new language with any level of success.
I would argue the effectiveness point.
It's certainly helpful, but it has a tendency to go for very non-idiomatic patterns (like using exceptions for control flow).
Plus, it has issues which I assume are an effect of reinforcement learning - it struggles with letting things crash and tends to silence failures that should never fail silently.
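To make that concrete, here's a hypothetical sketch (not actual model output; the module name and config-file setup are made up) of the two styles, assuming the file just holds a port number:

    defmodule ConfigReader do
      # The non-idiomatic pattern I keep seeing: exceptions as control flow,
      # with the failure silently converted into a default.
      def port_with_rescue(path) do
        try do
          path |> File.read!() |> String.trim() |> String.to_integer()
        rescue
          _ -> 4000   # swallows missing files, bad permissions, typos... silently
        end
      end

      # More idiomatic: match the {:ok, _} / {:error, _} tuples you expect,
      # and let anything truly unexpected crash so a supervisor can handle it.
      def port(path) do
        with {:ok, contents} <- File.read(path),
             {port, ""} <- Integer.parse(String.trim(contents)) do
          {:ok, port}
        else
          {:error, reason} -> {:error, reason}
          _ -> {:error, :invalid_port}
        end
      end
    end

The first version runs fine, but it hides exactly the failures the BEAM is designed to surface.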
I've tried different LLMs with various languages so far: Python, C++, Julia, Elixir, and JavaScript.
The SOTA models do a great job with all of them, but if I had to rank their capabilities per language it would look like this:
JavaScript, Julia > Elixir > Python > C++
That's just a sample size of one, but I suspect that for all but the most esoteric programming languages there is more than enough code in the training data.
You can accurately describe Elixir syntax in a few paragraphs, and the semantics are pretty straightforward. I'd imagine doing complex supervision trees falls flat.
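For reference, a minimal tree is simple enough (module names below are made up); it's the deeper trees with mixed restart strategies and dynamic supervisors where I'd expect it to fall over:

    defmodule MyApp.Supervisor do
      use Supervisor

      def start_link(opts), do: Supervisor.start_link(__MODULE__, :ok, opts)

      @impl true
      def init(:ok) do
        children = [
          # a stateful worker (assumed to be a GenServer with a child_spec/1)
          {MyApp.Cache, name: MyApp.Cache},
          # a nested supervisor for processes started on demand
          {DynamicSupervisor, name: MyApp.ConnSup, strategy: :one_for_one}
        ]

        # if the cache dies, everything started after it is restarted too
        Supervisor.init(children, strategy: :rest_for_one)
      end
    end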
Unless that new language has truly esoteric concepts, it's trivial to pattern-match it to regular programming constructs (loops, functions, ...)
I gave a talk about this. Without evidence, I suspect it's due to the "poisoning" phenomenon: only a few examples (~250 IIRC) are enough to push the needle, seemingly independent of LLM parameter count. Elixir has some really high-quality examples available, so there is likely a "positive poisoning" effect.