logoalt Hacker News

Draikenyesterday at 11:51 AM1 replyview on HN

Yet this is not reproducible. This is the whole issue with LLMs: they are random.

You cannot trust that it'll do a good job on all reports so you'll have to manually review the LLMs reports anyways or hope that real issues didn't get false-negatives or fake ones got false-positives.

This is what I've seen most LLM proponents do: they gloss over the issues and tell everyone it's all fine. Who cares about the details? They don't review the gigantic pile of slop code/answers/results they generate. They skim and say YOLO. Worked for my narrow set of anecdotal tests, so it must work for everything!

IIRC DOGE did something like this to analyze government jobs that were needed or not and then fired people based on that. Guess how good the result was?

This is a very similar scenario: make some judgement call based on a small set of data. It absolutely sucks at it. And I'm not even going to get into the issue of liability which is another can of worms.


Replies

colechristensenyesterday at 7:36 PM

Is it not reproducable? Someone up thread reproduced it and expanded on it. It worked for me the first time I prompted. Did you try it or are you just guessing that it's not reproducable because that's what you already think?

I'm not talking about completely replacing humans, the goal of this exercise was demonstrating how to use an LLM to filter out garbage. Low quality semi-anonymous reports don't deserve a whole lot of accuracy and being conservative and rejecting most reports even when you throw out legitimate ones is fine.

You seem like regardless of evidence presented, your prejudices will lead you to the same conclusions, so what's the point discussing anything? I looked for, found, and shared evidence, you're sharing your opinion.

>IIRC DOGE did something like this to analyze government jobs that were needed or not and then fired people based on that. Guess how good the result was?

I'm talking about filtering spammy communication channels, that has nothing like the care required in making employment decisions.

Your comment is plainly just bad faith and prejudice.