You are using it wrong, or are using a weak model if your failure rate is over 50%. My experience is nothing like this. It very consistently works for me. Maybe there is a <5% chance it takes the wrong approach, but you can quickly steer it in the right direction.
you are using it on easy questions. some of us are not.