Hacker News

oofbey yesterday at 4:27 PM

Oh, I’ve had agents remove tests plenty of times. Or cripple the tests so they pass but are useless, which is more common and harder to prompt against.


Replies

maddmann yesterday at 7:11 PM

Ah true, that can also happen. In aggregate, though, I think models tend to expand codebases rather than contract them. This is anecdotal, and it's probably something AI labs and coding agent companies are looking at now.
