I've only used 5.4 for 1 prompt so far (edit: 3 at high now), with reasoning set to extra high, which took really long. It was to analyse my codebase and write an evaluation on a topic. But I found its writing and analysis thoughtful, precise, and surprisingly clearly written, unlike 5.3-Codex. It feels very lucid and uses human phrasing.
It might be my AGENTS.md requiring clearer, simpler language, but at least 5.4's doing a good job of following the guidelines. 5.3-Codex wasn't so great at simple, clear writing.
5.4 at very high reasoning didn't notice a glaring issue in my codebase that drops all data being sent around the network.
That's been my experience as well switching from Opus to Codex. Reasoning takes longer but answers are precise. Claude is sloppy in comparison.
> It might be my AGENTS.md requiring clearer, simpler language
If you gave the exact same markdown file to me and I posted the exact same prompts as you, would I get the same results?
The latest research suggests that including an AGENTS.md file only makes outcomes worse with frontier models.
Honestly, while I'd like to believe you, there's always a post about how $MODEL+1 delivered powerful insights about the very nature of the universe in precise Hegelian dialectic, while $MODEL's output was indistinguishable from a pack of screeching, sexually frustrated bonobos.