Hacker News

AIorNot 01/21/2025 (1 reply)

Same concept in LLMs as referenced in this video by Chris Olah at Anthropic:

https://www.reddit.com/r/OpenAI/comments/1grxo1c/anthropics_...

also see: https://distill.pub/2021/multimodal-neurons/


Replies

aithrowawaycomm 01/21/2025

The authors of the second piece specifically said this was not the same thing: the fact that they weakly fire for loosely-associated concepts is very different from (and ultimately shallower than) concept neurons:

  Looking to neuroscience, they might sound like “grandmother neurons,” but their associative nature distinguishes them from how many neuroscientists interpret that term. The term “concept neurons” has sometimes been used to describe biological neurons with similar properties, but this framing might encourage people to overinterpret these artificial neurons. Instead, the authors generally think of these neurons as being something like the visual version of a topic feature, activating for features we might expect to be similar in a word embedding.

The "turtle+PhD" artificial neuron is a good example of this distinction: it just pulls the loosely related concepts of turtles and academia together into a single neuron, without representing a coherent concept.