Opus 4.5 catches all sorts of things a linter would not, and with little manual prompting at that: missing DB indexes, forgotten migration scenarios, inconsistencies with similar services, overlooked edge cases.
Now I'm getting a robot to review the branch at regular intervals and poke holes in my thinking. The trick is not to use an LLM as a confirmation machine.
It doesn't replace a human reviewer.
I don't see the point of paying for yet another CI integration doing LLM code review.
I came to the same conclusion and ended up wiring a custom pipeline with LangGraph and Celery. The markup on the SaaS options is hard to justify given the raw API costs. The main benefit of rolling it yourself seems to be the control over context retrieval: I can force it to look at specific Postgres schemas or related service definitions that a generic CI integration usually misses.
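To make the "control over context retrieval" point concrete, here is a minimal sketch in plain Python of the prompt-assembly step only. It is not the LangGraph/Celery pipeline itself; `ContextDoc`, `build_review_prompt`, the section labels, and the `max_chars` budget are all hypothetical names chosen for illustration, and the actual model call is left out.

```python
from dataclasses import dataclass


@dataclass
class ContextDoc:
    """Extra material the reviewer is forced to see (hypothetical type)."""
    label: str  # e.g. "Postgres schema: orders"
    body: str   # raw text of the schema or service definition


def build_review_prompt(diff: str, context: list[ContextDoc],
                        max_chars: int = 20_000) -> str:
    """Assemble one prompt: instructions, forced context, then the diff."""
    parts = [
        # Ask for problems, not agreement, so the model is not
        # used as a confirmation machine.
        "Review these code changes. Look for problems; "
        "do not just confirm the approach.",
        "Check for missing DB indexes, unhandled migration scenarios, "
        "inconsistencies with the related definitions below, and edge cases.",
    ]
    for doc in context:
        parts.append(f"### Context: {doc.label}\n{doc.body.strip()}")
    parts.append(f"### Diff\n{diff.strip()}")
    prompt = "\n\n".join(parts)
    # Crude budget guard (assumption): truncate instead of overflowing.
    return prompt[:max_chars]


# Usage with made-up inputs.
schema = ContextDoc(
    "Postgres schema: orders",
    "CREATE TABLE orders (id bigint PRIMARY KEY, user_id bigint);",
)
prompt = build_review_prompt(
    "+ SELECT * FROM orders WHERE user_id = %s", [schema]
)
```

The resulting string would then go to whatever model API you use. The design choice here is simply that the context list is picked by you, per repository or per service, rather than guessed by a generic integration.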
All those LLM wrapper companies make no sense.
Currently attempting to get GitLab Duo's review feature enabled as a 'second pair of eyes'. I agree 100% that it's not replacing a human review.
On the whole I would prefer 'lint-style' tools to catch most stuff, because they don't hallucinate.
But obviously they don't catch everything, so an LLM-based review seems like a useful additional tool.
You’ve found the smoking gun!
Exactly. This is like buying a smoothie blender when you already have an all-purpose mixer-blender. This whole space is at best an open-source project, not a whole company (let alone multiple!).
It's very unlikely that any of these tools are getting better results than simply prompting verbatim "review these code changes" in your branch with the SOTA model du jour.