Hacker News

adastra22 · today at 1:52 AM · 4 replies

They do know what they don't know. There's a probability distribution for outputs that they are sampling from. That just isn't being used for that purpose.
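(The uncertainty carried by a per-token distribution can be made concrete with its entropy: a peaked distribution means the model is "sure" of the next token, a flat one means it isn't. A minimal sketch with toy logits, not taken from any real model:)

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution over tokens."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy in bits: 0 = fully certain, higher = less certain."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A peaked distribution (the model strongly prefers one token)...
sure = softmax([8.0, 1.0, 0.5, 0.2])
# ...versus a nearly flat one (no strong preference).
unsure = softmax([1.1, 1.0, 0.9, 1.0])

print(entropy_bits(sure) < entropy_bits(unsure))  # True: peaked = lower entropy
```

The point being that the signal exists at sampling time; whether anything downstream reads it is a separate question.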


Replies

chongli · today at 3:35 AM

Having a probability distribution to sample from is not the same thing as knowing, because they don’t know anything about the provenance of the data that was used to build the distribution. They trust their training set implicitly, by construction. They have no means to detect systematic errors in their training set.

dhampi · today at 2:45 AM

Well, with thinking models, it’s not that simple. The probability distribution is over the next token. But if a model thinks before producing an answer, you can get a high-confidence next token even though MCMC-style sampling of the model’s thinking chains would reveal that the real distribution over answers has low confidence.
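(That gap can be illustrated with a toy simulation: each sampled chain ends by emitting one answer with high next-token confidence, but agreement *across* chains exposes the underlying uncertainty. A hypothetical sketch — `sample_chain` is a stand-in, not any model's API:)

```python
import random
from collections import Counter

def sample_chain(rng):
    """Stand-in for sampling one full reasoning chain and returning its
    final answer. Each chain commits confidently, but chains disagree."""
    return rng.choices(["A", "B", "C"], weights=[0.5, 0.3, 0.2])[0]

def chain_level_confidence(n=1000, seed=0):
    """Estimate answer confidence as agreement across many sampled chains."""
    rng = random.Random(seed)
    counts = Counter(sample_chain(rng) for _ in range(n))
    answer, hits = counts.most_common(1)[0]
    return answer, hits / n

answer, conf = chain_level_confidence()
print(answer, conf)  # prints the majority answer and its agreement rate
```

Here the final token of any single chain looks confident, but the chain-level agreement rate hovers around 0.5 — which is the low confidence the comment is describing.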

raddan · today at 2:20 AM

I’m not clear on what you mean by “know.” If you mean “the information is in the model,” then I mostly agree: distributional information is represented somewhere. But if you mean that a model can actually access this information in a meaningful and accurate way — say, to state its confidence level — I don’t think that’s true. There is a stochastic process sampling from those distributions, but can the process introspect? That would be a very surprising capability.

Isamu · today at 2:08 AM

Oh, you mean that somewhere it is tracking the statistical likelihood of the output. Yeah, I buy that, although I think it just tends toward the most likely output given the context it is dragging along. I mean, it wouldn’t deliberately choose something really statistically unlikely — that would be a non sequitur.
