I'd guess the same has always been true for READMEs / human dev docs. Of course it doesn't transfer directly, but it still feels incredible to be in an age where we can measure such (previously) theoretical things with synthetic programmers.
Yeah, isn't this obvious? Bad docs create triple work: you do it wrong (1), you figure out it's not working because the doc is wrong (2), you do it the right way (3). Between 2 and 3 is figuring out what the right way is, which a good doc ideally shortcuts.
But obviously, if you tell somebody "make a boiled egg. To boil an egg you have to crack it into the pan first," that's a lot worse than just "make a boiled egg." Especially when you have an infinitely trusting, zero-common-sense executor like an agentic model.