Hacker News

roenxi yesterday at 11:57 AM

If we had a theoretical technique to identify the true and objective reality, we'd use it in the courts and laboratories. There is no such technique, but what we do have is two techniques that seem to work:

1) Has a certain standard of evidence been met?

2) Are the related arguments free of logical inconsistencies?

We can train the LLMs to do 2, and maybe even 1 to some extent (though the quality of evidence a computer can practically gather is limited). But that isn't going to get rid of hallucinations, for the same reason courts are hit-and-miss and the conclusions of studies often aren't very reliable. These techniques help, but sometimes they still get people to say things that, on close inspection, turn out to be nonsense. And those best-effort approaches are too much to expect for most questions an LLM will be handed, which are informal, low-stakes and don't need strong supporting evidence or logical rigour.

I think it is underestimated how many LLM-style hallucinations people themselves have. It just isn't obvious because most humans have a strategy of only repeating what the herd says after it has been socially vetted, which makes their individual eccentricities less obvious.

TL;DR: I don't think this looks like an easy problem for RLVR; it looks technically unsolvable. Even making progress requires a philosophical breakthrough on the nature of truth so that the objective function can be established.


Replies

stalfie yesterday at 2:09 PM

Well, I'd argue that this depends on the field you're investigating. Sometimes you have a way to identify objective reality and sometimes you don't. In mathematics the majority of the field is verifiable in this way. Coding is a bit less so, as it's intersubjective and the ideal methodology is subject to taste.

But even in muddy fields of reality like medicine, there are objective facts to be found. When someone comes into an ER with chest pain, you often find a true, undeniable reason for why that is happening. If their lung has collapsed, a coronary artery is clogged or the aorta is dissecting, then even if you don't find that out at the time, it tends to be clear in retrospect. The area of reality that becomes muddy is when we use proxy signals to try to figure out who gets promoted to the expensive/harmful examinations we can draw final conclusions from, or the cases that don't fit cleanly into one bucket or the other. But very often, the gold standard truly is golden.

Of course, many realms of reality cannot be verified in this way. But I'd argue that there are quite a few that can.
