Hacker News

macintux, today at 1:39 AM

Would you even know? Serious question. Given the volume of code the models can produce and the subtle ways these bugs can manifest (some only when under attack), it seems like they would be easy to overlook.


Replies

CamperBob2, today at 2:10 AM

I have a habit of getting GPT 5.5 to review everything Opus writes for me, and vice versa. The model in the reviewer role frequently finds things I overlooked myself, occasionally in parts of the code I wrote.

No modern LLM has found any buffer overflow bugs in parts of my code that originated from another LLM. Again, though, they have found one or two that were my fault.