Hacker News

richk449 last Saturday at 5:22 PM (10 replies)

> if these tools are close to having super-human intelligence, and they make humans so much more productive, why aren't we seeing improvements at a much faster rate than we are now? Why aren't inherent problems like hallucination already solved, or at least less of an issue? Surely the smartest researchers and engineers money can buy would be dogfooding, no?

Hallucination does seem to be much less of an issue now. I hardly even hear about it - like it just faded away.

As far as I can tell, smart engineers are using AI tools - particularly people doing coding, but even those in non-coding roles.

The criticism feels about three years out of date.


Replies

imiric last Saturday at 5:54 PM

Not at all. The reason it's not talked about as much these days is because the prevailing way to work around it is by using "agents". I.e. by continuously prompting the LLM in a loop until it happens to generate the correct response. This brute force approach is hardly a solution, especially in fields that don't have a quick way of verifying the output. In programming, trying to compile the code can catch many (but definitely not all) issues. In other science and humanities fields this is just not possible, and verifying the output is much more labor intensive.
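For reference, the loop in question boils down to something like the sketch below. This is a minimal illustration, not any particular tool's implementation: ask_llm is a hypothetical stand-in for a real model API, and the only "verification" is a py_compile pass, which is exactly why this approach only helps in fields with cheap mechanical checks.

    import subprocess
    import sys
    import tempfile

    def ask_llm(prompt: str) -> str:
        """Hypothetical placeholder for a real model API call."""
        raise NotImplementedError

    def compiles(code: str) -> tuple[bool, str]:
        """Cheap mechanical check: does the snippet at least byte-compile?"""
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        result = subprocess.run([sys.executable, "-m", "py_compile", path],
                                capture_output=True, text=True)
        return result.returncode == 0, result.stderr

    def generate_until_it_compiles(prompt: str, max_attempts: int = 5) -> str | None:
        feedback = ""
        for _ in range(max_attempts):
            code = ask_llm(prompt + feedback)      # re-prompt with any error feedback
            ok, errors = compiles(code)
            if ok:
                return code  # "verified" only as far as the compiler can see
            feedback = "\n\nThe previous attempt failed to compile:\n" + errors
        return None  # give up after a handful of brute-force retries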

The other reason is that the primary focus of the last 3 years has been scaling up the data and hardware, with a bunch of (much needed) engineering around it. This has produced better results, but it can't sustain the AGI promises for much longer. The industry can only survive on shiny value-added services and smoke and mirrors for so long.

natebc last Saturday at 7:01 PM

> Hallucination does seem to be much less of an issue now. I hardly even hear about it - like it just faded away.

Last week I had Claude and ChatGPT both tell me about different non-existent options for migrating a virtual machine from VMware to Hyper-V.

The week before that, one of them (I don't remember which, honestly) gave me non-existent options for fio.

Both of these are things the first-party documentation or man page gets right, but I was being lazy and trying to save time or be more efficient, like these tools are supposed to let us do. Not so much.

Hallucinations are still a problem.

nunez last Saturday at 8:51 PM

The few times I've used Google to search for something (Kagi is amazing!), the Gemini assistant at the top has fabricated something insanely wrong.

A few days ago, I asked free ChatGPT to tell me the head brewer of a small brewery in Corpus Christi. It told me the brewery didn't exist, which it does (we were headed there a few minutes later), and after re-prompting it, all it gave me was some phone number it found in a business filing. (ChatGPT has been using web search for RAG for some time now.)

Hallucinations are still a massive problem IMO.

taormina last Saturday at 8:48 PM

ChatGPT constantly hallucinates - at least once per conversation I attempt to have with it. We all gave up on bitching about it constantly because we would never talk about anything else, but I have no reason to believe that any LLM has come anywhere close to solving this problem.

amlib last Saturday at 9:25 PM

How can it not be hallucinating anymore if everything the current crop of generative AI algorithms does IS hallucination? What actually happens is that sometimes the hallucinated output is "right", or, more precisely, coherent with the user's mental model.

majormajor last Saturday at 8:27 PM

> Hallucination does seem to be much less of an issue now. I hardly even hear about it - like it just faded away.

Nonsense - there is a TON of discussion around how the standard workflow is "have Cursor-or-whatever check the linter, try to run the tests, and keep iterating until it gets it right", which is nothing but working around hallucinations. Functions that don't exist. Lines that don't do what the code would have required them to do. Etc. And yet I still hit cases at least weekly, when trying to use these "agents" for more complex things, where one talks itself in circles and can't figure it out.

What are you trying to get these things to do, and how are you validating that there are no hallucinations? You hardly ever "hear about it" but ... do you see it? How deeply are you checking for it?

(It's also just old news - a new hallucination is less newsworthy now; we're all so used to it.)

Of course, the internet is full of people claiming that they are using the same tools I am with several times the output. Yet I wonder... if this is the case, where is the accelerating improvement in quality in any of the open-source software I use daily? Where are the new 10x-AI-agent-produced replacements? (Or the closed-source products, for that matter - but there it's harder to track the actual code.) Or is everyone doing less-technical, less-intricate work just getting hyped into a tizzy about faster generation of basic boilerplate for languages they hadn't personally mastered before?

HexDecOctBin last Saturday at 11:43 PM

I just tried asking ChatGPT how to "force PhotoSync to not upload images to a B2 bucket that are already uploaded previously", and all it could do was hallucinate options that don't exist and web pages that are irrelevant. This was with the latest model, with all the reasoning and researching applied, and across multiple messages in multiple chats. So no, hallucination is still a huge problem.

leptons last Saturday at 5:51 PM

Are you hallucinating?? "AI" is still constantly hallucinating. It still writes pointless code that does nothing toward what I actually need it to do, far more often than is acceptable.

kevinventullo last Saturday at 11:43 PM

You don't hear about it anymore because it's not worth talking about anymore. Everyone implicitly understands that these models are liable to make up nonsense.