
sachiniyer01 • today at 2:33 PM

> It would be good to specifically measure what Detail alone finds, versus what all the bots together find. We could look at that as a completely separate measure.

Yeah, running a combo of review bots does seem like the right meta. It would be pretty interesting to see how Detail stacks up against 3 or 4 review bots all together. For some anec-data, though: PostHog has 4 review bots and we still seem to find important stuff there.
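A rough sketch of how that comparison could be counted, assuming each tool's findings can be reduced to comparable IDs (that deduplication step is the hard part and is not shown). The function name, bot names, and finding IDs here are all hypothetical, not anything from a real eval:

```python
def compare_findings(detail: set[str], bots: dict[str, set[str]]) -> dict[str, set[str]]:
    """Split findings into Detail-only, bots-only, and shared.

    `detail` is the set of finding IDs from Detail (hypothetical IDs).
    `bots` maps each review bot's name to its own set of finding IDs.
    """
    # Union of everything any review bot found.
    combined = set().union(*bots.values())
    return {
        "detail_only": detail - combined,  # found by Detail, missed by every bot
        "bots_only": combined - detail,    # found by at least one bot, missed by Detail
        "both": detail & combined,         # found by Detail and at least one bot
    }


# Made-up example data, purely for illustration.
result = compare_findings(
    detail={"bug-1", "bug-2"},
    bots={"bot_a": {"bug-2", "bug-3"}, "bot_b": {"bug-4"}},
)
print(sorted(result["detail_only"]))
```

The `detail_only` bucket is the "separate measure" from the quote: what Detail adds on top of the whole stack of bots, rather than on top of any single one.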

> Second, when the same AI model is doing the judging, it’s being unfair.

Yup, agreed. We use a combo of models rn (not primarily Claude). The two review bots are Codex and Gemini, and their output is reviewed by Claude.
