ElFitz, today at 6:32 AM

I’ve been making Codex and Claude get their work reviewed by the most recent, best-performing model of their own family, and of each other’s, for months.

On top of that, we have been running multi-model AI reviews on every PR through their respective GitHub integrations (Codex, Gemini, Copilot, Greptile, CodeRabbit).

Their findings never fully overlap, and yet somehow they usually all miss the same things. The most significant improvement came from having agents commit their plan along with their work.

On the upside, it means I get to focus my reviews on different things.