In my “repo os” we have an adversarial agent harness running gpt5.4 for plan and implementation and opus4.6 for review. This was the clear winner in the bake-off when 5.4 came out a couple months ago.
Re-ran the bake-off with 4.7 authoring and… gpt5.4 still clearly winning. Same skills, same prompts, same agents.md.