
red75prime, last Sunday at 10:27 PM

> I don't think there's any fundamental difference in the principle of their operation

Yeah, they seem to be subject to the universal approximation theorem (it needs to be checked more thoroughly, but I think we can build a transformer that is equivalent to any given fully-connected multilayer network).
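To make the parenthetical a bit more concrete, here is a rough sketch rather than a proof. It assumes a heavily simplified transformer block: a single token, no LayerNorm, and an attention value projection set to zero so the attention sublayer adds nothing. All function names (`mlp`, `transformer_block`, `embed_mlp`) are made up for illustration. The idea is to widen the residual stream so the feed-forward sublayer reads the input from one set of slots and writes the MLP's output into another.

```python
import random

def matvec(W, v):
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def relu(v):
    return [max(0.0, a) for a in v]

def mlp(x, W1, b1, W2, b2):
    # Reference network: one hidden ReLU layer, then a linear output layer.
    return add(matvec(W2, relu(add(matvec(W1, x), b1))), b2)

def transformer_block(stream, W_in, b_in, W_out, b_out):
    # Simplified block (assumption): one token, no LayerNorm.
    # With the value projection set to zero, attention contributes nothing.
    attn_out = [0.0] * len(stream)
    stream = add(stream, attn_out)
    # Feed-forward sublayer with a residual connection.
    ffn_out = add(matvec(W_out, relu(add(matvec(W_in, stream), b_in))), b_out)
    return add(stream, ffn_out)

def embed_mlp(W1, b1, W2, b2):
    # Residual stream layout: [input slots | output slots].
    d_in, d_hidden, d_out = len(W1[0]), len(W1), len(W2)
    W_in = [list(row) + [0.0] * d_out for row in W1]      # read input slots only
    W_out = [[0.0] * d_hidden for _ in range(d_in)] + [list(r) for r in W2]
    b_out = [0.0] * d_in + list(b2)                       # write output slots only
    return W_in, list(b1), W_out, b_out

rng = random.Random(0)
d_in, d_hidden, d_out = 3, 5, 2
W1 = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_hidden)]
b1 = [rng.uniform(-1, 1) for _ in range(d_hidden)]
W2 = [[rng.uniform(-1, 1) for _ in range(d_hidden)] for _ in range(d_out)]
b2 = [rng.uniform(-1, 1) for _ in range(d_out)]
x = [rng.uniform(-1, 1) for _ in range(d_in)]

stream = x + [0.0] * d_out            # output slots start empty
out = transformer_block(stream, *embed_mlp(W1, b1, W2, b2))
y_block = out[d_in:]                  # what the block wrote to the output slots
y_mlp = mlp(x, W1, b1, W2, b2)
```

In this toy setup `y_block` matches `y_mlp`, and the input slots pass through unchanged. Deeper networks would need more blocks and more slots, and a real transformer's LayerNorm would have to be handled separately, which is part of what still needs checking.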

That is, at a certain size they can do anything a human can do at a given point in their life (that is, with no additional training), regardless of whether humans have world models and what those models are on the neuronal level.

But there are additional nuances related to their architectures and training regimes, as well as practical questions about the required size.