Hacker News

anon291 · 11/07/2024 · 1 reply

In the paper 'Hopfield Networks is All You Need', they calculate the number of patterns that can be 'stored' and retrieved in attention layers, and it's exponential in the dimension of the pattern embeddings. So essentially, you can store more 'ideas' in an LLM than there are particles in the universe. I think we'll be good.

From a technical perspective, this is due to the softmax activation function, which creates a high degree of separation between the stored memory points.
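For the curious, here's a minimal sketch of the retrieval update the paper analyzes, xi <- X softmax(beta * X^T xi), which has the same form as attention. The function name, beta, and the random-pattern demo numbers below are purely illustrative, not from the paper's code:

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def hopfield_retrieve(X, query, beta=8.0, steps=1):
    # Modern Hopfield / attention-style retrieval step (Ramsauer et al., 2020):
    #   xi <- X @ softmax(beta * X.T @ xi)
    # X: (d, N) matrix whose columns are the N stored patterns.
    # query: (d,) possibly noisy probe vector.
    xi = query.copy()
    for _ in range(steps):
        xi = X @ softmax(beta * (X.T @ xi))
    return xi

# Toy demo with random unit-norm patterns (hypothetical numbers, not from the paper):
rng = np.random.default_rng(0)
d, N = 64, 1000                                    # pattern dimension, number of stored patterns
X = rng.standard_normal((d, N))
X /= np.linalg.norm(X, axis=0)                     # unit-norm columns
probe = X[:, 42] + 0.3 * rng.standard_normal(d)    # noisy version of stored pattern 42
out = hopfield_retrieve(X, probe)
print(int(np.argmax(X.T @ out)))                   # should snap back to index 42
```

The exponential-capacity result is about how many such patterns can be stored and still cleanly retrieved as the dimension d grows; the sharp softmax weights are what keep the retrieved point pinned near a single stored pattern rather than a blur of many.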


Replies

Jerrrrrrry · 11/07/2024

> So essentially, you can store more 'ideas' in an LLM than there are particles in the universe. I think we'll be good.
If it can compress humanity's knowledge corpus to <80 GB without even being quantization-optimized, then I take the combination of my ironically typo'd double negative and your seemingly genuine confirmation to be absolute confirmation:

we are fukt