IMHO, LLMs are better at Python and SQL than Haskell because Python and SQL syntax mirrors more aspects of human language, whereas Haskell syntax reads more like a math equation. These are Large _Language_ Models, so naturally intelligence learned from non-code sources transfers better to more human-like programming languages. Math equations assume the reader has context, not included in the written-down part, for what the symbols mean.
They are heavily post-trained on code and math these days. I don’t think we can infer that much about their behavior from just the pre-training dataset anymore.
They are not called Context-Sensitive Large Language Models though.
LLMs are very good at bash, which I’d argue doesn’t read like language or math.
I suspect you’re probably right, but just for completeness, one could also make the argument that LLMs are better at writing Haskell because they are overfit to natural language and Haskell would avoid a lot of the overfit spaces and thus would generalize better. In other words, less baggage.
I have found recent models to be quite respectable at Haskell, given a couple of initial nudges on style - but that’s true of anything.