I definitely agree that ChatGPT can be incorrect. I’ve seen that myself. In my experience, though, it’s more often right than wrong.
So when you say “in nearly every question on complex topics”, I’m curious what specific examples you’re seeing.
Would you be open to sharing a concrete example?
Specifically: the question you asked, the part of the answer you know is wrong, and what the correct answer should be.
I have a hypothesis (not a claim) that some of the failures you’re seeing might be prompt-sensitive, and I’d be curious to test that with a small experiment if you’re willing.
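To make that concrete, here’s roughly what I’d run; this is just a minimal sketch assuming the official OpenAI Python SDK and an API key in the environment, with the model name and prompt variants as placeholders rather than anything definitive:

```python
# Rough experiment: ask the same question under a few prompt phrasings and
# see whether the wrong answer is stable or flips with the prompt.
# Assumes: `pip install openai` and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

question = "..."  # fill in the exact question you asked

# A few variants of the same underlying question (placeholders, not a recipe).
variants = [
    question,
    f"Answer step by step, and say 'I'm not sure' if uncertain:\n{question}",
    f"Before answering, list the facts you are relying on:\n{question}",
]

for prompt in variants:
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; use whichever model you were testing
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # reduce run-to-run variance
    )
    print("PROMPT:", prompt)
    print("ANSWER:", response.choices[0].message.content)
    print("-" * 40)
```

If the incorrect part of the answer shows up across all the variants, that would point away from my prompt-sensitivity hypothesis; if it only shows up in some of them, that would tell us something too.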