Hacker News

hliyan — today at 12:02 PM

In early 2023, I remember someone breathlessly explaining that there were signs that LLMs seemingly good at chess/checkers moves might have a rudimentary model of the board within them, somehow magically encoded into the model weights through training. I was stupid enough to briefly entertain the possibility until I actually bothered to develop a high-level understanding of the transformer architecture. It's surprising how much mysticism this field seems to attract. Perhaps being a non-deterministic, linguistically invoked black box triggers the same internal impulses that draw some people to magic and spellcasting.


Replies

pegasus — today at 12:47 PM

Just because it's not that hard to reach a high-level understanding of the transformer pipeline doesn't mean we understand how these systems function, or that there can be no form of world model developing inside them. Recently there has been more evidence for that particular idea [1]. The feats of apparent intelligence LLMs sometimes display have taken even their creators by surprise. Sure, there's a lot of hype too; that's part and parcel of any new technology today. But we are far from understanding what makes them perform so well. In that sense, yeah, you could say they are a bit "magical".

[1] https://the-decoder.com/new-othello-experiment-supports-the-...
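For context, the Othello-GPT line of work referenced in [1] tests the "world model" claim by training a small linear probe to decode board state from a model's internal activations. Here's a minimal sketch of the probing idea using a synthetic stand-in for real transformer activations (the data-generating setup, dimensions, and hyperparameters below are all illustrative assumptions, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a transformer's residual stream: each activation
# vector linearly encodes the (binary, for simplicity) state of one
# board square along some direction, plus noise.
n_samples, d_model = 2000, 64
feature_dir = rng.normal(size=d_model)            # hypothetical "board square" direction
board_state = rng.integers(0, 2, size=n_samples)  # 0/1 state of that square
activations = (board_state[:, None] * feature_dir
               + 0.5 * rng.normal(size=(n_samples, d_model)))

X_train, y_train = activations[:1500], board_state[:1500]
X_test, y_test = activations[1500:], board_state[1500:]

# Linear probe: logistic regression fit by plain gradient descent.
w, b, lr = np.zeros(d_model), 0.0, 0.1
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X_train @ w + b)))
    w -= lr * (X_train.T @ (p - y_train)) / len(y_train)
    b -= lr * np.mean(p - y_train)

# If the state is linearly decodable, accuracy lands well above the
# 50% chance level; if activations carried no board information, a
# linear probe could not beat chance.
acc = np.mean(((X_test @ w + b) > 0) == y_test)
print(f"probe accuracy: {acc:.2f}")
```

The real experiments do the same thing against actual Othello-GPT activations, square by square; high probe accuracy there is the evidence that something board-like is encoded in the weights and activations.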

momojo — today at 4:43 PM

I'm not a fan of mysticism. I'm also with you that these are simply statistical machines. But I don't understand what happened when you understood transformers at a high level.

If you're saying the magic disappeared after looking at a single transformer, did the magic of human intelligence disappear after you understood human neurons at a high level?