At $DAYJOB, there's an internal version of this, which I think just uses Claude Code (or similar) under the hood on a checked-out copy of the PR.
Then it can run `git diff` to get the diff, like you mentioned, but also query surrounding context, build stuff, run random stuff like `bazel query` to identify dependency chains, etc.
They've put a ton of work into tuning it and it shows: the signal-to-noise ratio is excellent. I can't think of a single time it's left a comment on a PR that wasn't a legitimate issue.
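For anyone curious what the context-gathering part might look like, here's a rough sketch. This is not the internal tool, just my guess at the shape of it. The function names, the `origin/main` base ref, and the `//app:server` target are all made up for illustration, and I'm assuming you already know which Bazel targets the PR touches.

```python
import subprocess
from typing import Callable, Dict, List

# A runner takes an argv list and returns the command's stdout.
Runner = Callable[[List[str]], str]


def run_command(argv: List[str]) -> str:
    """Run a command in the current checkout and return its stdout."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


def context_commands(base_ref: str, targets: List[str]) -> Dict[str, List[str]]:
    """Build the commands a reviewer agent might run on a checked-out PR.

    base_ref and targets are hypothetical inputs for this sketch.
    """
    commands = {
        # Diff of the PR branch against its merge base with base_ref.
        "diff": ["git", "diff", f"{base_ref}...HEAD"],
        # Just the paths, handy for deciding what else to read.
        "changed_files": ["git", "diff", "--name-only", f"{base_ref}...HEAD"],
    }
    for target in targets:
        # Everything in the workspace that depends on the touched target.
        commands[f"rdeps:{target}"] = ["bazel", "query", f"rdeps(//..., {target})"]
    return commands


def collect_context(
    base_ref: str, targets: List[str], runner: Runner = run_command
) -> Dict[str, str]:
    """Run every context command and return its output keyed by name."""
    return {
        name: runner(argv)
        for name, argv in context_commands(base_ref, targets).items()
    }
```

Calling `collect_context("origin/main", ["//app:server"])` inside the checkout would give you the diff, the changed paths, and the reverse dependencies to hand to the model. The `runner` parameter is only there so you can swap in a fake and test it without git or Bazel installed.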
Yeah, it’s exceptionally easy to set this up and we have the same thing. Except the team hasn’t had time to fine-tune it, and it shows.