You have to keep an eye on them, but they don't just make tests pass.
Claude sonnet 4 (this time last year) did do this. It once made simulation if a test script passing. Literally a script that just echoed test names and then said pass.
Claude sonnet 4 (this time last year) did do this. It once made simulation if a test script passing. Literally a script that just echoed test names and then said pass.