1M context in OpenAI and Gemini models is just marketing. Opus is the only model that provides real, usable bug context.
Source? I ask because I use 500k+ context on these on a daily basis.
Big refactorings guided by automated tests eat context window for breakfast.
Codex with high reasoning has been a legitimately excellent tool for generating feedback on every plan Claude Opus (thinking) has created for me.
I'm directly conveying my actual experience to you. I have tasks that fill up Opus's context very quickly (at its 200k limit) and which took MUCH longer to fill up Codex since 5.2 (which I think had a 400k context window at the time).
This is a direct comparison. I spent months subscribed to both of their $200/mo plans. I would try both, and Opus always filled up fast while Codex kept working great. It's also my direct experience that Codex has continued working well post-compaction since 5.2.
I don't know about Gemini, but you're just wrong about Codex. And I say this as someone who hates reporting these facts, because I'd like people to stop giving OpenAI money.