Hacker News

Jerrrrrrry 11/07/2024

Once your model and map get larger than the thing they are modeling/mapping, then what?

Let us hope the pigeonhole principle isn't flawed, or else we may find ourselves as batteries in the Matrix.


Replies

anon291 11/07/2024

In the paper 'Hopfield Networks is All You Need', they calculate the total number of patterns that can be 'stored' in the attention layers, and it's exponential in the dimension of the stored patterns. So essentially, you can store more 'ideas' in an LLM than there are particles in the universe. I think we'll be good.
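
(For a rough sense of the scaling: a loose paraphrase of the paper's capacity result is below; the constant c and the exact form are illustrative, not the precise bound from the paper.

    % Exponential storage capacity of modern (softmax) Hopfield networks,
    % loosely paraphrased: N patterns retrievable with high probability
    % in embedding dimension d, for some constant c > 1.
    N \;\gtrsim\; c^{\,d}, \qquad c > 1

So doubling the embedding dimension roughly squares the number of storable patterns, which is where comparisons like 'more than particles in the universe' come from.)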

From a technical perspective, this is due to the softmax activation function, which creates a high degree of separation between stored memory points.
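
To make that concrete, here is a minimal sketch of the retrieval update from the paper (the softmax-attention form of a modern Hopfield network). The variable names, toy sizes, and the value of beta are illustrative choices, not values from the paper:

    import numpy as np

    # Sketch of the modern Hopfield retrieval step (Ramsauer et al., 2020):
    # xi_new = X^T softmax(beta * X xi), where the rows of X are stored patterns.
    rng = np.random.default_rng(0)

    d = 64        # pattern dimension
    n = 10        # number of stored patterns
    beta = 8.0    # inverse temperature; larger beta -> sharper softmax

    X = rng.standard_normal((n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)   # normalize stored patterns

    def softmax(z):
        z = z - z.max()
        e = np.exp(z)
        return e / e.sum()

    def retrieve(query, steps=1):
        """One or more Hopfield update steps: xi <- X^T softmax(beta * X xi)."""
        xi = query.copy()
        for _ in range(steps):
            xi = X.T @ softmax(beta * (X @ xi))
        return xi

    # Query = a stored pattern plus noise; one update usually snaps back to it.
    target = X[3]
    noisy = target + 0.3 * rng.standard_normal(d)
    out = retrieve(noisy, steps=1)

    print("closest stored pattern:", np.argmax(X @ out))   # expected: 3

With a reasonably large beta, the softmax puts nearly all of its weight on the single closest stored pattern, which is the 'high degree of separation' between memory points referred to above.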
