Seems like this is an aspect of their well-known overconfidence and their inability to self-reflect and recognize that they have to ask for more details because their priors are too low. If you look at the output of reasoning models, it’s clear that the idea of asking for clarification very rarely occurs to them – when they’re confused, it’s just endless speculation about what the user might have meant.
This, of course, has certain implications for the wisdom of “replacing human programmers”, given that one of the hard parts of the trade is turning vague and often confused ideas into precise specifications by interacting with the stakeholders.
The inability of LLMs to ask for clarification was exactly the flaw we encountered when testing them on open-ended problems stated somewhat ambiguously. This was in the context of paradoxical situations, tested on DeepSeek-R1 and Claude-3.7-Sonnet. Blog post about our experiments: https://pankajpansari.github.io/posts/paradoxes/
> inability to self-reflect and recognize they have to ask for more details because their priors are too low.
Gemini 2.5 Pro and ChatGPT-o3 have often asked me to provide additional details before doing a requested task. Gemini sometimes comes up with multiple options and requests my input before doing the task.
Isn’t this relatively trivial to correct? Just as chain-of-thought reasoning replaces end tokens with “hmm” to continue the thought, can’t users just replace the LLM’s tokens whenever it starts saying “maybe they are referring to” with something like “Let me ask a clarifying question before I proceed”?
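If you wanted to prototype that, one rough approach is to watch the decode stream for speculation phrases and splice in a nudge to ask the user instead. A toy sketch in Python below; `stream_tokens` and `continue_from` are stand-ins for whatever streaming/continuation hooks a given inference stack exposes, not real APIs:

```python
# Toy sketch of the token-substitution idea above. `stream_tokens(prompt)`
# and `continue_from(prompt, text)` are hypothetical stand-ins for a
# streaming decode call and a "continue from this prefix" call.

SPECULATION_MARKERS = (
    "maybe they are referring to",
    "perhaps the user means",
)

NUDGE = "Let me ask a clarifying question before I proceed: "

def generate_with_clarification(prompt, stream_tokens, continue_from):
    """Stream a completion; if the model starts guessing at intent,
    cut the speculative fragment and force a clarifying question."""
    output = ""
    for token in stream_tokens(prompt):
        output += token
        lowered = output.lower()
        for marker in SPECULATION_MARKERS:
            if marker in lowered[-80:]:
                # Drop the speculation and re-prompt from the edited prefix.
                output = output[: lowered.rfind(marker)] + NUDGE
                return output + continue_from(prompt, output)
    return output
```

Whether the model actually follows the injected prefix instead of going right back to guessing is an empirical question, though.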
Real programmers spend a ton of time just figuring out what people actually want. LLMs still treat guessing as a feature.
> and the inability to self-reflect and recognize they have to ask for more details
They're great at both tasks; you just have to ask them to do it.
> inability to self-reflect
IMO the One Weird Trick for LLMs is recognizing that there's no real entity, and that users are being tricked into a suspended-disbelief story.
In most cases you're contributing text-lines for a User-character in a movie-script document, and the LLM algorithm is periodically triggered to autocomplete incomplete lines for a Chatbot character.
You can have an interview with a vampire DraculaBot, but that character can only "self-reflect" in the same shallow/fictional way that it can "thirst for blood" or "turn into a cloud of bats."
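To make that concrete, here's a toy sketch of the script-autocomplete framing (`complete` is a placeholder for any raw text-completion call, not a specific API):

```python
# Toy sketch of the "movie-script document" framing: the chat is one growing
# text document, and each turn just asks the model to autocomplete the next
# Chatbot line. `complete(text)` is a hypothetical raw-completion function.

def chat_turn(transcript: str, user_line: str, complete) -> str:
    # Add the User-character's line, then leave the Chatbot line open-ended.
    transcript += f"User: {user_line}\nChatbot:"
    # The model sees the whole script and writes the next line of dialogue.
    reply = complete(transcript)
    return transcript + reply + "\n"

transcript = "A helpful Chatbot answers a User's questions.\n"
# transcript = chat_turn(transcript, "Can you self-reflect?", complete)
```

Any "self-reflection" in the Chatbot's line is just more dialogue the autocomplete found plausible for that character.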