This is not necessarily a fundamental limitation. It's a consequence of a fine-tuning process where human raters decide how "good" an answer is. They're not rating the flow of the conversation, but looking at how complete / comprehensive the answer to a one-shot question looks like. This selects for walls of overconfident text.
Another thing the vendors are selecting for is safety / PR risk. If an LLM answers to a hobby chemistry question in a matter-of-factly way, that's a disastrous PR headline in the making. If they open with several paragraphs of disclaimers or just refuse to answer, that's a win.