Hacker News

rfv6723 · today at 11:16 AM · 2 replies

You've mistaken the battlefield. This isn't about descriptive grammar. It's about the decades-long dominance of Chomsky's entire philosophy of language.

His central argument has always been that language is too complex and nuanced to be learned simply from exposure. Therefore, he concluded, humans must possess an innate, pre-wired "language organ"—a Universal Grammar.

LLMs are a spectacular demolition of that premise. They prove that with a vast enough dataset, complex linguistic structure can be mastered through statistical pattern recognition alone.

The panic from Chomsky and his acolytes isn't that of a humble linguist. It is the fury of a high priest watching a machine commit the ultimate heresy: achieving linguistic mastery without needing his innate, god-given grammar.


Replies

raincole · today at 11:31 AM

> LLMs are a spectacular demolition of that premise.

It really isn't. While I personally think the Universal Grammar theory is flawed (or at least that Chomsky's presentation of it is), LLMs don't debunk it.

Right now we have machines that recognize faces better than humans do. But that doesn't mean humans lack some innate biological "hardware" for facial recognition that machines don't possess. The machines simply outperform the biological hardware with their own, different approach.

Also, I highly recommend you express your ideas with your own words instead of letting an LLM present them. It's painfully obvious.

adrian_b · today at 11:44 AM

I do not see how it can be claimed that "LLMs are a spectacular demolition of that premise", because LLMs must be trained on an amount of text far greater than what a human is ever exposed to.

I learned one foreign language just by being exposed to it almost daily, by watching movies spoken in that language, without any additional aids like a dictionary or a grammar book (none were available where I lived; this was before the Internet). However, I was helped in guessing the meanings of words and the grammar of the language, not only by seeing what the characters in the movies were doing, correlated with the spoken phrases, but also by the fact that I already knew a couple of languages with many similarities to the language of the movies I was watching.

In any case, the amount of spoken language I was exposed to over the year or so it took to become fluent was many orders of magnitude less than what is used to train an LLM.
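That gap can be made concrete with a rough back-of-envelope calculation. All the figures below are assumptions chosen only for order-of-magnitude illustration (a typical conversational speech rate, a couple of hours of movies per day, and a training corpus in the low trillions of tokens); none come from the comment itself:

```python
import math

# Assumed, illustrative figures -- not measurements:
WORDS_PER_MINUTE = 150     # rough conversational speech rate
MOVIE_HOURS_PER_DAY = 2    # assumed daily exposure from movies
DAYS = 365                 # "a year or so" of exposure

human_words = WORDS_PER_MINUTE * 60 * MOVIE_HOURS_PER_DAY * DAYS
llm_tokens = 15e12         # assumed training-corpus size, order of magnitude only

orders_of_magnitude = math.log10(llm_tokens / human_words)

print(f"human exposure: ~{human_words:,} words")           # a few million words
print(f"gap: ~{orders_of_magnitude:.1f} orders of magnitude")
```

With these assumptions the human hears on the order of millions of words, versus trillions of tokens for the model, i.e. a gap of roughly six orders of magnitude, consistent with the comment's claim.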

I do not know whether any innate knowledge of grammar was involved, but knowing the grammar of other languages certainly helped tremendously in reducing the amount of exposure I needed, because after seeing only a few examples I could guess the generally applicable grammar rules.

There is no doubt that the way an LLM learns is much dumber than the way a human learns, which is why it must be compensated for with a much larger amount of training data.

The current inefficiency of LLM training has already caused serious problems for a great number of people. Prices for DRAM modules and large SSDs have skyrocketed because memory devices are being hoarded by the rich, who hope to become richer by using LLMs, so many buyers have had to give up on various kinds of electronic devices, or to accept devices of much worse quality than they had desired and planned. I believe this proves beyond doubt that the way LLMs learn is, for now, not good enough, and it is certainly not a positive achievement, as more people have been hurt by it than have benefited from it.