I have a somewhat similar question (but significantly more difficult), involving transportation. To me it really seems that a lot of the models are trained to have an anti-car and anti-driving bias, to the point that it hinders the model's ability to reason correctly or give correct answers.
I would expect this bias to be injected during the model's post-training procedure, and likely implicitly. Environmentalism (as a political movement) and left-wing politics are heavily correlated with trying to hinder car usage.
Grok has most consistently been correct here, which definitely implies this is an alignment issue caused by post-training.
Yes, Grok gets it right even when told not to use web search. But the answer I got from the fast model is nonsensical: it recommends driving because you wouldn't save any time by walking and because "you'd have to walk back wet". The thinking-fast model gets it correct, for the right reasons, every time. Chain of thought really helps in this case.
Interestingly, Gemini also gets it right. It seems to be better able to pick up on the fact it's a trick question.
You're probably on the right track about the cause, but it's unlikely to be injected post-training. I'd expect post-training to help improve the situation. The problem starts with the training set: if you just train an LLM on the internet, you get extreme far-left models. This problem has been talked about by all the major labs. Meta said fixing it was one of their main focuses for Llama 4 in their release announcement, and xAI and OpenAI have made similar comments. The xAI team have probably just done a lot more to clean the dataset.
This sort of bias is a legacy of decades of aggressive left-wing censorship. Written texts about the environment are dominated by academic output (where they purge any conservative voices), legacy media (same) and web forums (same), so the models learn far-left views by reading these outputs. The first versions of Claude and GPT had this problem: they'd refuse to tell you how to make a tuna sandwich, or prefer nuking a city to using words the left finds offensive. The bias is then partly corrected in post-training and by trying to filter the dataset to be more representative of reality.
Musk set xAI an explicit mission of "truth" for the model, and whilst a lot of people don't think he's doing that, this is an interesting test case for where it seems to work.
Gemini's training is probably less focused on cleaning up the dataset, but it just has stronger logical reasoning capabilities in general than other models, and that can override ideological bias.