Okay, so take the sandwich. There is no way to know what is in it by looking at it. No amount of optimisation will fix this.
I'm sure one could produce a computer vision model that was a lot better at guessing here than these LLMs are, but fundamentally it would still be guessing.