
wolttam · today at 1:39 PM · 1 reply

The meaning of tokens loses touch with language in the deeper layers of a large language model's neural net.

Language is just the input/output modality.


Replies

Mordisquito · today at 2:17 PM

I'll admit I am not an expert in the field, but the fact that "chain-of-thought" optimisations work by having the model extend its own context with more language hints, to me, that what we consider an "intelligent" response is ultimately contingent on language processing.
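The mechanism being described can be sketched in a few lines: autoregressive decoding appends each generated token back onto the input, so any reasoning steps the model writes out become part of what it conditions on next. The `toy_model` below is a hypothetical stand-in for a next-token predictor, not a real LLM API.

```python
def toy_model(context: list[str]) -> str:
    """Pretend next-token predictor: emits canned reasoning steps, then an answer."""
    script = ["Step 1:", "add", "2+2", "Answer:", "4", "<eos>"]
    return script[min(len(context) - 1, len(script) - 1)]

def generate(prompt: list[str], max_tokens: int = 10) -> list[str]:
    """Autoregressive loop: each emitted token re-enters the context."""
    context = list(prompt)
    for _ in range(max_tokens):
        token = toy_model(context)
        if token == "<eos>":
            break
        context.append(token)  # generated language becomes part of the input
    return context

print(generate(["Q:"]))
# The "chain of thought" tokens ("Step 1:", "add", "2+2") are ordinary
# language that the model conditions on before producing "Answer: 4".
```

The point of the sketch is only that chain-of-thought is not a separate reasoning module: it is more language fed back through the same loop.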

In any case, if language is just the input/output modality, where is the intelligence when language is not involved? Is the "intelligence" of the ChatGPT/Claude/Gemini models dependent on the human-curated linguistic dataset they have been trained on, or is it prior to that? If a SOTA LLM were trained on the same dataset but never put through RLHF to make it respond to human prompts, would it be intelligent? What would be the expression of that intelligence?
