> (hallucination caveats notwithstanding)
This is a pretty big caveat to the goal of
> develop a much more meaningful and comprehensive understanding
Which is still my biggest issue with LLMs. In the little I've used them, the answers are still confidently wrong a lot of the time. Has this changed?
I've found them to be quite accurate when given enough context data. For example, feeding an article into the context window and asking questions about it. Relying on the LLM's internal trained knowledge seems to be less reliable.
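A minimal sketch of that pattern, assuming the OpenAI Python client (v1.x) and a "gpt-4o" model name purely as placeholders for whatever LLM you actually use:

    # Sketch: ground the model on a pasted article instead of its trained knowledge.
    # Assumes the openai package (v1.x) and OPENAI_API_KEY in the environment; the
    # file name, model, and question are all illustrative.
    from openai import OpenAI

    client = OpenAI()

    article = open("article.txt").read()  # the source text you want answers grounded in

    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Answer only from the article below; say so if it isn't covered.\n\n" + article},
            {"role": "user",
             "content": "What factors does the article say affect the freezing point?"},
        ],
    )
    print(response.choices[0].message.content)

Telling it to say when something isn't covered seems to help keep it from falling back on its internal knowledge.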
I've found that whatever powers Kagi.com's answers seems to be pretty accurate. It cites articles and other sources.
Trying a share link, hope it works:
https://kagi.com/search?q=what+factors+affect+the+freezing+p...
I agree in general, but the way this has worked for me in practice is that I approach things hierarchically, up and down. Any specific hallucinations tend to come out in the wash as the same question is asked from different layers of abstraction.
I use ChatGPT a lot each day for writing and organizing tasks, and for summaries/explanations of articles, etc.
When dealing with topics I'm familiar with, I've found the hallucinations have dropped substantially in the last few years from GPT-2 to GPT-3 to GPT-4 to 4o, especially when web search is incorporated.
LLMs perform best in this regard when working with existing text that you've fed them (whether via web search or uploaded text/documents). So if you paste the text of a study to start the conversation, it's a pretty safe bet you'll be fine.
If you don't have web search turned on, I'd still avoid treating the chat as a search engine though, because 4o will still get little details wrong here and there, especially for newer or more niche topics that wouldn't be as well-represented in the training data.