Hacker News

striking, yesterday at 8:07 PM (3 replies)

Even if interpretability of specific models or features within them is an open area of research, the mechanics of how LLMs work to produce results are observable and well-understood, and methods to understand their fundamental limitations are pretty solid these days as well.

Is there anything to be gained from following a line of reasoning that basically says LLMs are incomprehensible, full stop?


Replies

famouswaffles, yesterday at 8:56 PM

>Even if interpretability of specific models or features within them is an open area of research, the mechanics of how LLMs work to produce results are observable and well-understood, and methods to understand their fundamental limitations are pretty solid these days as well.

If you train a transformer on (only) lots and lots of addition pairs, e.g. '38393 + 79628 = 118021', and nothing else, the transformer will, during training, discover an algorithm for addition and employ it in service of predicting the next token, which in this instance would be the sum of the two numbers.

We know this because of tedious interpretability research, the very limited problem space, and the fact that we knew exactly what to look for.
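
To make the setup concrete, here is a minimal sketch of what such an addition-only training set could look like, and what the model is asked to predict. The function names, the five-digit limit, and the prompt/target split are assumptions made for illustration; they are not taken from any specific paper or training pipeline.

```python
import random


def make_example(rng, max_digits=5):
    """Build one training string such as '38393 + 79628 = 118021'.

    max_digits is an illustrative choice, not a value from any real setup.
    """
    a = rng.randrange(10 ** max_digits)
    b = rng.randrange(10 ** max_digits)
    return f"{a} + {b} = {a + b}"


def split_prompt_target(example):
    """Split a training string into the context and the text to be predicted."""
    prompt, target = example.split(" = ")
    return prompt + " = ", target


rng = random.Random(0)  # fixed seed so the sketch is reproducible
dataset = [make_example(rng) for _ in range(3)]

# The model only ever sees strings like these; the sum is what it must
# produce, token by token, after the '=' sign.
prompt, target = split_prompt_target("38393 + 79628 = 118021")
```

Nothing in data like this states how to carry digits. Whatever procedure the trained model ends up using has to be recovered afterwards by inspecting its weights and activations.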

Alright, let's leave addition aside (SOTA LLMs are after all trained on much more) and think about another question. Any other question at all. How about something like:

"Take a capital letter J and a right parenthesis, ). Take the parenthesis, rotate it counterclockwise 90 degrees, and put it on top of the J. What everyday object does that resemble?"

What algorithm does GPT or Gemini or whatever employ to answer this and similar questions correctly? It's certainly not the one it learnt for addition. Do you know? No. Do the creators at OpenAI or Google know? Not at all. Can you or they find out right now? Also no.

Let's revisit your statement.

"the mechanics of how LLMs work to produce results are observable and well-understood".

Observable, I'll give you that, but how on earth can you look at the above and sincerely call that 'well-understood'?

hn_acc1, yesterday at 8:23 PM

You can't keep pushing the AI hype train if you consider it just a new type of software / fancy statistical database.

menaerus, yesterday at 8:12 PM

Yes, there is: the benefit of the doubt.