Hacker News

Retric · yesterday at 1:37 AM

I don’t think it’s an oversimplification, as accuracy is what constrains LLMs across so many domains. If you’re a wealthy person, asking ChatGPT to write a prenup or any other contract you intend to use would be an act of stupidity unless you vetted it with an actual lawyer. My most desired use case is closer to feasible, but LLMs are still more than an order of magnitude below the accuracy I am willing to tolerate.

IMO that’s what maturity means in AI systems. Self-driving cars aren’t limited by the underlying mechanical complexity; it’s all about the long quest for a system that makes reasonably correct decisions hundreds of times a second, for years, across widely varying regions and weather conditions. Individual cruise missiles, on the other hand, only needed to operate across a single short, pre-mapped flight in specific conditions, which is why they could use visual navigation decades earlier.


Replies

adamzwasserman · yesterday at 4:08 PM

You're conflating two different questions. I'm not arguing that LLMs are mature or reliable enough for high-stakes tasks. My argument is about why they produce output that creates the illusion of understanding in the language domain, while the same techniques applied to other domains (video generation, molecular modeling, etc.) don't produce anything resembling "understanding" despite comparable or greater effort.

The accuracy problems you're describing actually support my point: LLMs navigate linguistic structures effectively enough to fool people into thinking they understand, but they can't verify their outputs against reality. That's exactly what you'd expect from a system that only has access to the map (language) and not the territory (reality).
