If an LLM struggled to follow instructions closely unless they were wrapped in XML, I would take that as a strong sign of a poor model and, by extension, poor model training.