Hacker News

hexaga, today at 4:35 PM

>> But we don't usually believe in both the bullshit and in the fact that the BS is actually BS.

> I can't parse what you mean by this.

The point is that humans care about the state of a distributed shared world model and use language to perform partial updates to it according to their preferences about that state.

Humans who prefer one state (the earth is flat) do not -- as a rule -- use language to undermine it. Flat earthers don't tell you all the reasons the earth cannot be flat.

But even further than this, humans also have complex meta-preferences about that state, and their use of language reflects those too. Your example is relevant here:

> My dad absolutely insisted that the water draining in toilets or sinks are meaningfully influenced by the Coriolis effect [...]

> [...] should have been able to figure out from first principles why the Coriolis effect is exactly zero on the equator itself, didn't.

This is an exemplar of human behavior. Humans act like this. LLMs don't. If your dad had figured it out from first principles, said so, and still kept insisting on his position, I would suspect him of being an LLM, because that's how LLMs 'communicate'.

Now that the what is clear -- why? Humans experience social missteps like that as part of the loss surface. Being caught in a lie sucks, so people learn not to lie, or to get better at it. That, and a million other tiny aspects of how humans use language in an overarching social context.

The loss surface that LLMs see doesn't have that feedback, except in the long tail of doing Regularized General Document Corpora prediction perfectly. But that point is very far away compared to training directly on the social signal, where honesty is immediately available as a solution and is established very early in training instead of at the limit of low loss.

How humans learn (embedded in a social context from day one) is very effective at teaching foundational abilities fast. Natural selection cooked hard. LLM training recipes do not compare; they're just worse in so many different ways.