I am moderately anti-AI, but I don't understand the purpose of feeding them trick questions and watching them fail. The "gullibility" might even be a feature: the model is supposed to help a user who genuinely wants it to be useful, not fight against them. You could probably train, or maybe even prompt, an existing LLM to always question the prompt, but it would become very difficult to steer.
But this one isn't like the "how many r's in strawberry" question: missing a key requirement for success is exactly the kind of failure mode that could make a model spend millions of tokens building something completely useless.
That said, I saw the title before I realized this was an LLM thing, and was confused: taken as a genuine question, it becomes "should I get it washed there, or wash it at home?", where the "wash it at home" option implies picking up supplies; but that doesn't quite work either.
But as others have said: this particular confusion is pretty obvious, yet a huge amount of our communication contains confusions like this, and identifying them is one of the key activities of knowledge work.