The "platonic representation hypothesis" crowd can't stop winning.
Potentially useful for things like innate mathematical operation primitives. A major part of what makes it hard to imbue LLMs with better circuits is that we don't know how to connect them to the model internally, in a way that the model can learn to leverage.
Having an "in" on broadly compatible representations might make things like this easier to pull off.
Different models, similar number representations. Different models for different languages, similar concept representations. They have to learn all of this from human text input, so they're not divining it themselves. It all makes a strong case for universal grammar, IMO.
The eigenvalue distribution looks somewhat similar to Benford's Law - isn't that expected for a human-curated corpus?
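For anyone who wants the baseline to compare against: Benford's Law predicts that the leading digit d of values in many naturally occurring datasets appears with probability log10(1 + 1/d). A minimal sketch (my own, not from the paper):

```python
import math

# Benford's Law: P(leading digit = d) = log10(1 + 1/d),
# heavily skewed toward small digits (~30% of values start with 1).
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d, p in benford.items():
    print(f"{d}: {p:.3f}")
```

Whether the eigenvalue spectrum of a learned representation should follow the same skew as leading digits in a curated corpus is a separate question, though.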
(Pardon the self-promotion) Libraries like turnstyle are taking advantage of shared representations across models. Neurosymbolic programming: https://github.com/jdonaldson/turnstyle
It's going to turn out that emergent states that are the same or similar across different learning systems fed roughly the same training data will be very common. I also predict it will explain much of what people today call "instinct" in animals (and the related behaviors in humans).
Curious if this similarity comes more from the training data or the model architecture itself. Did they look into that?
> Language models trained on natural text learn to represent numbers using periodic features with dominant periods at T=2,5,10.
This proves a decimal system is correct. Base twelve numeral systems are clearly unnatural and inefficient.
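Joking aside, the quoted finding is easy to picture. A sketch (my own illustration, not the paper's code) of what "periodic features with periods T = 2, 5, 10" could look like: sinusoidal features at those periods, from which a linear readout can recover n mod T.

```python
import numpy as np

def periodic_features(n, periods=(2, 5, 10)):
    """Encode integer n as sin/cos features at the given periods.

    Numbers congruent mod lcm(periods) get identical feature vectors,
    so the encoding exposes residues (e.g. parity, last digit) linearly.
    """
    feats = []
    for T in periods:
        feats += [np.sin(2 * np.pi * n / T), np.cos(2 * np.pi * n / T)]
    return np.array(feats)

# 3 and 13 share all residues mod 2, 5, 10, so their features match:
print(np.allclose(periodic_features(3), periodic_features(13)))  # True
# 3 and 4 differ in parity, so they don't:
print(np.allclose(periodic_features(3), periodic_features(4)))   # False
```

Periods 2, 5, and 10 are exactly what you'd expect a model to pick up from base-10 text: parity, last-digit-mod-5, and the last digit itself.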
Title is editorialized and needs to be fixed; the paper does not say what this title implies, nor is that the title of the paper.