Hacker News

phire · today at 1:51 AM · 4 replies

Humans aren't very diligent in the long term. If an LLM does something correctly enough times in a row (or close enough), humans are likely to stop checking its work thoroughly.

This isn't exactly a new problem; we do it with any bit of new software/hardware, not just LLMs. We check its work when it's new, and then tend to trust it over time as it proves itself.

But it seems to be hitting us worse with LLMs, as they are less consistent than previous software. And LLM hallucinations are particularly dangerous, because they are often plausible enough to pass the sniff test. We just aren't used to handling something this unpredictable.


Replies

Waterluvian · today at 1:56 AM

It’s a core part of the job and there’s simply no excuse for complacency.

zahlman · today at 2:17 AM

There's a weird inconsistency among the more pro-AI people: they expect this output to pass as human, but then don't give it the review that an outsourced human's work would get.

vidarh · today at 2:18 AM

The irony is that, while far from perfect, an LLM-based fact-checking agent is likely to be far more diligent (though it still needs human review as well), because it's trivial to ensure it has no memory of having already worked through a long list of checks. If you pass e.g. Claude a long list directly in the same context, it is prone to deciding the task is "tedious" and starting to take shortcuts.
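A rough sketch of what I mean, where call_llm is just a stand-in for whatever single-prompt client call you use (not any particular SDK):

    # Rough sketch: check each claim in its own fresh context, so the model
    # never sees how long the overall list is. call_llm is a placeholder.
    from typing import Callable

    def check_claims(claims: list[str], call_llm: Callable[[str], str]) -> dict[str, str]:
        verdicts = {}
        for claim in claims:
            # One claim per prompt, no history: there is no "tedious" list
            # for the model to start taking shortcuts on.
            prompt = (
                "Fact-check the following claim. Reply with SUPPORTED, "
                "REFUTED, or UNCLEAR, then a one-line reason.\n\n"
                f"Claim: {claim}"
            )
            verdicts[claim] = call_llm(prompt)
        return verdicts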

But at the same time, doing that makes it even more likely the human in the loop will get sloppy, because there'll be even fewer cases where their input is actually needed.

I'm wondering if you need to start inserting intentional canaries to validate whether humans are actually doing sufficiently thorough reviews.
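Something along these lines, as a crude sketch (the field names is_canary and flagged are made up purely for illustration):

    # Rough sketch: mix known-bad canary items into the review queue and
    # score what fraction the human reviewer actually flags.
    import random

    def seed_canaries(items: list[dict], canaries: list[dict], rate: float = 0.05) -> list[dict]:
        # Real items pass through untouched; canaries are marked so we can
        # score the reviewer later.
        seeded = [dict(i, is_canary=False) for i in items]
        k = min(max(1, int(len(items) * rate)), len(canaries))
        seeded += [dict(c, is_canary=True) for c in random.sample(canaries, k)]
        random.shuffle(seeded)
        return seeded

    def reviewer_catch_rate(reviewed: list[dict]) -> float:
        # A reviewer doing thorough checks should flag nearly every canary.
        planted = [r for r in reviewed if r["is_canary"]]
        if not planted:
            return 1.0
        return sum(bool(r.get("flagged")) for r in planted) / len(planted)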