
bensyverson today at 3:40 PM

I get the frustration, but it's reductive to just call LLMs "bullshit machines" as if the models are not improving. The current flagship models are not perfect, but if you use GPT-2 for a few minutes, it's incredible how much the industry has progressed in seven years.

It's true that people don't have a good intuitive sense of what the models are good or bad at (see: counting the Rs in "strawberry"), but this is more a human limitation than a fundamental problem with the technology.


Replies

the_snooze today at 3:49 PM

Two things can be true at the same time: The technology has improved, and the technology in its current state still isn't fit for purpose.

I stress test commercially deployed LLMs like Gemini and Claude with trivial tasks: sports trivia, fixing recipes, explaining board game rules, etc. They work well maybe 95% of the time. That's fine for inconsequential things. But you'd have to be deeply irresponsible to accept that kind of error rate on things that actually matter.

The most intellectually honest way to evaluate these things is how they behave now on real tasks. Not with some unfalsifiable appeal to the future of "oh, they'll fix it."

Arainach today at 3:48 PM

Whether LLMs can create correct content doesn't matter. We've already seen how they are being used and will be used.

Fake content and lies. To drive outrage. To influence elections. To distract from real crimes. To overload everyone so they're too tired to fight or to understand. To weaken the concept that anything's true so that you can say anything. Because who cares if the world dies as long as you made lots of money on the way.

gdulli today at 4:01 PM

Computer graphics have been improving for decades but the uncanny valley remains undefeated. I don't know why anyone expects a breakthrough in other areas. There's a wall we hit and we don't understand our own consciousness and effectiveness well enough to replicate it.

zdragnar today at 3:48 PM

That's not why the author calls them bullshit machines.

> One way to understand an LLM is as an improv machine. It takes a stream of tokens, like a conversation, and says “yes, and then…” This yes-and behavior is why some people call LLMs bullshit machines. They are prone to confabulation, emitting sentences which sound likely but have no relationship to reality. They treat sarcasm and fantasy credulously, misunderstand context clues, and tell people to put glue on pizza.

Yes, there have been improvements to them, but none of those improvements mitigate the core flaw of the technology. The author even acknowledges the improvements of the last few months.

p_stuart82 today at 4:16 PM

Models are improving. The pricing already assumes they're ready for prod. That's where the fires start.

karmakaze today at 4:13 PM

Bullshit is the perfect term here. Even as AIs get much better and more capable, Brandolini's Law, aka the "bullshit asymmetry principle," always applies: the energy required to refute misinformation is an order of magnitude larger than that needed to produce it. Even using AIs effectively today requires a very good BS detector; some day in the future it won't.

mcpar-land today at 4:13 PM

It's not a bullshit machine because its output is bad; it's a bullshit machine because its output is literally "bullshit": output that is statistically likely but has no factual or reasoning basis. As the models have improved, their bullshit has become more statistically likely to sound coherent (maybe even more likely to be "accurate"), but it is no more factual and involves no more reasoning.

ura_yukimitsu today at 4:14 PM

Calling LLMs "bullshit machines" is a reference to a 2024 paper [1] which itself uses the concept of "bullshit" as defined in the essay/book "On Bullshit" by Harry G. Frankfurt [2]. The TL;DR is that LLMs are fundamentally bullshit machines because they are only made to generate sentences that sound plausible, but plausible does not always mean true.

[1]: https://link.springer.com/article/10.1007/s10676-024-09775-5

[2]: https://en.wikipedia.org/wiki/On_Bullshit

4ndrewl today at 4:06 PM

It doesn't matter how good the models become. They can only deal in bullshit, in the academic use of the term.

Scaevolus today at 3:49 PM

They are bullshit machines because they do not have an internal mental model of truth the way a human does. The flagship models bullshit less, but their fundamental architecture gives truth no way to constrain the output.

https://philosophersmag.com/large-language-models-and-the-co...

ajross today at 3:44 PM

> it's reductive to just call LLMs "bullshit machines" as if the models are not improving

This is true, but I prefer to think of it as "It's delusional to pretend as if human beings are not bullshit machines too".

Lies are all we have. Our internal monologue is almost 100% fantasy. Even in serious pursuits, that's how it works. We make things up and lie to ourselves, and only later apply our hard-earned[1] skills to figure out whether or not we're right.

How many times have the nerds here been thinking through a great new idea for a design and how clever it would be before stopping to realize "Oh wait, that won't work because of XXX, which I forgot". That's a hallucination right there!

[1] Decades of education!
