Hacker News

marliechiller · yesterday at 5:40 PM · 16 replies

One thing I'm wondering about in the LLM age we seem to be entering: is there value in picking up a language like this if there's not going to be a corpus of training data for an LLM to learn from? I'd like to invest the time to learn Gleam, but I treat a language as a tool, or a means to an end. I feel like more and more I'm reaching for the tool that gets the job done most easily, which means the languages that LLMs seem to gel with.


Replies

thefaux · yesterday at 6:34 PM

In the medium to long term, if LLMs are unable to easily learn new languages and remap the knowledge they gained from training on different languages, then they will have failed in their mission of becoming a general intelligence.

victorbjorklund · yesterday at 6:05 PM

I feel that was more true 1-2 years ago. These days I find Claude Code writes almost as good (or as bad, depending on your perspective) Elixir code as JavaScript code, and there must be far less Elixir code in the training data.

isodev · yesterday at 8:50 PM

Claude reads and writes Gleam just fine. I think as long as the language syntax is well documented (with examples) and has meaningful diagnostics, LLMs can be useful. Gleam has both brilliant docs and diagnostics rivalling Rust's. Gleam is also a very well designed language: not many reserved words, very explicit APIs… all things that help LLMs.

Contrast with the likes of Swift - been around for years but it’s so bloated and obscure that coding agents (not just humans) have problems using it fully.

kace91 · yesterday at 5:50 PM

If you just see a language as a tool, then unless you're self-employed or working in open source, wouldn't the lack of job market demand for it be the first blocker?

perrygeo · today at 1:33 AM

The Gleam language, yes all of it, fits in a context window (https://tour.gleam.run/everything/)

I have similar concerns to you - how well a language works with LLMs is indeed an issue we have to consider. But why do you assume that it's the volume of training data that drives this advantage? Another assumption, equally if not more valid IMO, is that languages with fewer, well-defined, simpler constructs are easier for LLMs to generate.

Languages with sprawling complexity, where edge cases dominate dev time, all but require PBs of training data to be feasible.

Languages that are simple (objectively), with a solid, unwavering mental model, can play to LLMs' strengths - and completely leap-frog the competition in accurate code gen.

c-hendricks · yesterday at 5:44 PM

I hope this isn't the future of "new" languages. Hopefully newer AI tools can actually learn a language and they won't be the limiting factor.

dragonwriter · yesterday at 6:19 PM

It's pretty much the same as in every previous age: not having a community of experience, and the supporting materials such a community produces, has always been a disadvantage for early adopters of a new language. So the people who used it first were those with a particular need the language seemed to address that offset the cost for them, or who had a particular interest in being in the vanguard.

And those people are the people that develop the body of material that later people (and now LLMs) learn from.

armchairhacker · yesterday at 5:51 PM

Gleam isn't a particularly unusual language. The loss from generalizing may be less than the improved ergonomics, if not now then as LLMs improve.

christophilus · yesterday at 8:22 PM

I recently built something in Hare (a very niche new language), and Claude Code was helpful. Nowhere near as good as it is with TypeScript, but good enough that I don't see LLM support being in the top 5 reasons a language would fail to get adopted.

kryptiskt · yesterday at 9:41 PM

On the other hand, if you write a substantial amount of code in a niche language, the LLMs will pick up your coding style, since it will make up a sizable chunk of the training corpus.

Hammershaft · yesterday at 5:43 PM

This was one of my bigger worries about LLM coding: we might develop path dependence on the largest tools and languages.

dnautics · yesterday at 7:06 PM

Claude is really good at Elixir. IME, it's really, really good with a few "unofficial" tweaks to the language/frameworks, but this could be my bias. The LLM training cutoff was a fear of mine, but I think it's actually the opposite: we know that as few as 250 documents can "poison" an LLM, so I suspect that (for now) a small language with very high quality examples can "poison" LLMs for the better.

epolanski · yesterday at 8:28 PM

Of course there is, especially if you believe that LLMs will further improve at reasoning.

timeon · yesterday at 6:31 PM

Seems like you're not the target audience for these new languages, and that's OK. But I guess there are still many people who want to try new things (even on their own).

ModernMech · yesterday at 6:04 PM

Yes, because LLMs don't change the fact that different programming languages have different expressive capabilities. It's easier to say some things in some languages than in others. That doesn't change when it's an LLM writing the code; LLMs have finite context windows and limited attention. If you can express an algorithm in 3000 LOC in one language but 30 LOC in another, the more expressive language is still preferred, even if the LLM can spit out the 3000 lines in 1s. The reason is that a codebase 10-100x larger than it needs to be carries real costs that are not mitigated by LLMs or agents.

All things being equal, you'd still prefer the right tool for the job. That doesn't imply we should use Python for everything because it dominates the training set; it means we should make sure LLMs can write other programming languages equally well before we rely on them too much.

jedbrooke · yesterday at 6:13 PM

where do you think the corpus of training data comes from?