They are not mining on the wrong head. They are mining on the current head. If they find a block it will be accepted as the new head and the withheld block will be rejected, so it's not wasted mining time at all.
You can determine statistically whether you have found a block relatively early, and conversely whether other miners are unlikely to find one soon.
So you can get a head start on the next block by mining on top of the block you've found, which is likely to become the new head.
It only works on average, of course: you might be the one wasting resources if someone else publishes a block while you're withholding yours, but the trick is to gain an edge on average.
Now what happens if everyone is doing that calculation? That's where you need to do the game theory analysis (which I haven't done and don't claim to understand).
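To put a rough number on that risk, here is a minimal sketch. It assumes blocks arrive as a Poisson process, which is the usual model for proof-of-work mining. The function name `prob_others_publish`, the hash-power share `my_share`, and the 600-second mean interval (Bitcoin's target) are my own illustrative assumptions, not anything specific to the chain being discussed.

```python
import math

def prob_others_publish(wait_seconds, my_share, mean_block_interval=600.0):
    """Chance that the rest of the network finds a block while you withhold.

    Assumes Poisson block arrivals. The other miners hold (1 - my_share) of
    the hash power, so their block rate is that fraction of the network rate.
    """
    if not 0.0 <= my_share < 1.0:
        raise ValueError("my_share must be in [0, 1)")
    if wait_seconds < 0:
        raise ValueError("wait_seconds must be non-negative")
    rate_others = (1.0 - my_share) / mean_block_interval
    return 1.0 - math.exp(-rate_others * wait_seconds)

# Hypothetical numbers: a miner with 25% of the hash power.
for wait in (30, 120, 600):
    risk = prob_others_publish(wait, my_share=0.25)
    print(f"withhold {wait:>3}s -> risk of being beaten: {risk:.3f}")
```

Under the same model the process is memoryless, so the risk depends only on how long you withhold and on your share, not on how quickly your block happened to turn up.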
Not an expert, but I have two thoughts:
1. They don't have to wait until another miner finds a block; they can just wait "for some time" and then release their block. All that time gives them an edge on the next block.
2. My understanding is that if two different blocks are found concurrently for the same head, then the network waits for the next block to decide which "new head" is accepted. I.e. when there are competing chains, the longer chain wins. So I could imagine a strategy of waiting until some other miner announces their block and releasing yours precisely at that time, creating two competing chains. You presumably have an edge because you have already been mining on top of your block for a while.
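The second strategy can be tried out with a toy Monte Carlo simulation. This is only a sketch under simplifying assumptions of mine: `alpha` is the withholder's constant share of hash power, `gamma` is the fraction of the other miners that happen to build on the withholder's block during a tie, and the withholder publishes everything as soon as they are two blocks ahead. Network latency, difficulty adjustment and longer private chains are ignored, and all names are hypothetical.

```python
import random

def simulate_withholding(alpha, gamma, rounds=100_000, seed=0):
    """Return the withholder's share of blocks that end up on the main chain.

    alpha: withholder's share of total hash power (assumed constant)
    gamma: assumed fraction of other miners building on the withholder's
           block when two chains of equal length compete
    """
    rng = random.Random(seed)
    mine = others = 0
    for _ in range(rounds):
        if rng.random() >= alpha:
            # Someone else extends the public head; nothing was withheld.
            others += 1
            continue
        # The withholder found a block, keeps it private and mines on top.
        if rng.random() < alpha:
            # The head start paid off: a second private block, publish both.
            mine += 2
            continue
        # A competitor announced a block first: release now, creating a tie.
        r = rng.random()
        if r < alpha:
            mine += 2        # withholder extends their own branch
        elif r < alpha + (1 - alpha) * gamma:
            mine += 1        # another miner extends the withholder's branch
            others += 1
        else:
            others += 2      # competing branch wins, withheld block is dropped
    return mine / (mine + others)

# Compare the result with alpha, which is the share honest mining would give.
for alpha in (0.1, 0.25, 0.4):
    share = simulate_withholding(alpha, gamma=0.5)
    print(f"alpha={alpha:.2f} -> simulated share={share:.3f}")
```

If the simulated share comes out above `alpha`, withholding beats publishing immediately in this model; if it comes out below, the withheld blocks are being orphaned too often for the head start to pay off.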