
singpolyma3 today at 2:56 AM

Next do "why LLMs work"


Replies

sheeshkebab today at 3:21 AM

Considering they work with any architecture or configuration given enough compute, just more or less efficiently, maybe it's fundamental, in the same sense as why electricity works...

soupspaces today at 3:33 AM

Universal approximation theorem, embeddings, self-attention, gradient descent. And empirically, scaling laws.

skydhash today at 3:59 AM

Why does linear regression work? Why do computers work? Because it's about math and encoding information. If we can encode words as numbers, then why can't we encode their order as a relation? It's just that neural networks are very good at finding that relation even when it's noisy.
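To make the "encode words as numbers, encode their order as a relation" idea concrete, here is a minimal sketch. It is not a neural network: a plain bigram count table stands in for the learned relation, and the toy corpus and the helper names (`build_vocab`, `fit_bigrams`, `predict_next`) are made up for illustration.

```python
from collections import Counter, defaultdict


def build_vocab(tokens):
    """Map each distinct word to an integer id, in order of first appearance."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def fit_bigrams(ids):
    """Count how often each id is immediately followed by each other id."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(ids, ids[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, vocab, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    inverse = {i: w for w, i in vocab.items()}
    followers = counts.get(vocab.get(word))
    if not followers:
        return None
    best_id, _ = followers.most_common(1)[0]
    return inverse[best_id]


# Hypothetical toy corpus. "the" is usually followed by "cat", but not always,
# so the relation between neighbouring words is noisy.
corpus = "the cat sat on the mat . the cat ate . the dog sat on the cat .".split()

vocab = build_vocab(corpus)           # words -> numbers
ids = [vocab[w] for w in corpus]      # the text as a sequence of numbers
counts = fit_bigrams(ids)             # word order as a relation between ids

print(predict_next(counts, vocab, "the"))
print(predict_next(counts, vocab, "sat"))
```

The majority follower wins even though "mat" and "dog" also appear after "the", which is the crude version of finding the relation despite noise. A neural network replaces the count table with learned parameters that can generalize to word sequences it has not seen verbatim.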