We do see fewer invalid JSONs on latest bigger LLMs but still can happen on smaller and cheaper models. There is also case when input is truncated or a required field not found, which are inherently difficult.
On XML vs JSON, I think the goal here is to generate typed output where JSON with zod shines - for example the result can type check and be inserted to database typed columns later
Thing is even with XML LLM will fail every now and then.
I've built an agent in both tool calling and by parsing XML
You always need a self correcting loop built in, if you are editing a file with LLM you need provide hints so LLM gets it right the second time or 3rd or n time.
Just by switching to XML you'll not get that.
I used to use XML now i only use it for examples in in system prompt for model to learn. That's all