Hacker News

FloorEgg · yesterday at 5:03 PM

LLMs embody the free-energy principle computationally. They maintain an internal generative model of language and continually minimize "surprise", the difference between predicted and actual tokens, during both training and inference. In Friston's terms, their parameters encode beliefs about the causes of linguistic input; forward passes generate predictions, and backpropagation adjusts internal states to reduce prediction error, just as perception updates beliefs to minimize free energy. During inference, autoregressive generation can be viewed as active inference: each new token selection aims to bring predicted sensory input (the next word) into alignment with the model's expectations. In a broader sense, LLMs exemplify how a self-organizing system stabilizes itself in a high-dimensional environment by constantly reducing uncertainty about its inputs, a synthetic analogue of biological systems minimizing free energy to preserve their structural and informational coherence.
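
To make the "surprise" claim concrete: a minimal sketch, assuming PyTorch, where surprise is the negative log-likelihood of the actual next token and a single backprop step nudges the parameters to reduce it. The embedding-plus-linear model is a toy stand-in for a transformer, not any real LLM's architecture.

    import torch
    import torch.nn as nn

    vocab_size, dim = 100, 32
    # Toy next-token predictor standing in for a real LLM
    model = nn.Sequential(nn.Embedding(vocab_size, dim),
                          nn.Linear(dim, vocab_size))

    tokens = torch.randint(0, vocab_size, (1, 16))   # stand-in token stream
    inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict each next token

    logits = model(inputs)                           # predictions ("beliefs")
    # Surprise = -log p(actual token); cross-entropy averages it per position
    surprise = nn.functional.cross_entropy(
        logits.reshape(-1, vocab_size), targets.reshape(-1))

    surprise.backward()                              # prediction-error signal
    with torch.no_grad():
        for p in model.parameters():                 # one gradient step toward
            p -= 1e-2 * p.grad                       # lower expected surprise

In Friston's vocabulary, the loss is the surprise and the gradient step is the belief update; training is this loop repeated at scale.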


Replies

procaryote · yesterday at 7:08 PM

You might have lost me, but what you're describing doesn't sound like an LLM. E.g.:

> each new token selection aims to bring predicted sensory input (the next word) into alignment with the model’s expectations.

What does that mean? An LLM generates the next word based on what best matches its training, with some level of randomisation. Then it does it all again. It's not a perceptual process trying to infer a reality from sensor data or anything.
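
Concretely, the loop I mean is just this; a minimal sketch, assuming PyTorch, with made-up logits over a five-token vocabulary (temperature is the "level of randomisation"):

    import torch

    def sample_next(logits: torch.Tensor, temperature: float = 0.8) -> int:
        # Low temperature -> nearly greedy "best match to training";
        # high temperature -> flatter distribution, more randomisation
        probs = torch.softmax(logits / temperature, dim=-1)
        return torch.multinomial(probs, num_samples=1).item()

    logits = torch.tensor([2.0, 1.0, 0.5, 0.1, -1.0])  # illustrative scores
    print(sample_next(logits))                          # most often 0 or 1

There's no sensor and no external reality being inferred: just a learned distribution and a dice roll over it, repeated.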
