runarberg • today at 3:06 AM

There is a lot more context to the outrage which is missing from your analysis. People have multiple reasons to be mad at AI usage; you mention some of them in your introduction, and you put a (statistically insignificant) measure on only one of them. In your analysis you have shown that exactly one of these reasons is anecdotal. That does not mean they are wrong, and it especially does not mean they are unjustified.

That you found a single pre-AI release which did not cause outrage is proof of nothing. This single release is equally anecdotal, and statistically insignificant.

So, the biggest context that is missing here is that people hate AI for various reasons, and they don’t want their favorite tools to fall victim to AI for equally many reasons. It is only natural that people who hate AI react this way when they find out their favorite tool uses AI, and doubly so when they sniff a correlation between their favorite tool’s use of AI and bugs.

> I'm just trying to say that these specific releases are unremarkable, and there's no evidence at all of harm currently.

Well, there is no evidence against harm either. But what you did here is a bit of a sleight of hand. In your analysis your null hypothesis is: “There is no difference in bug count between releases which include code commits from Claude Code and releases which don’t”. (You then go about doing what every psychology major is taught not to do: find evidence for the null hypothesis, not against it.) However, what hypothesis testing is for is to use a representative sample to generalize over a wider population. You do hypothesis testing because you want to demonstrate that your sample is representative of a wider population, and that you did not just so happen to pick, by random chance, the two samples which show the effect regardless of the experiment.

By calculating the p-values you were telling me that you were in fact ready to make generalizing statements over a wider population of commits, but your results were statistically insignificant, so really you should not draw any conclusions from them. You have not, in fact, shown that they aren’t different from the rest of the population.
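
To make that last point concrete, here is a rough sketch of why a non-significant p-value from a small sample says very little. Everything in it is made up for illustration and none of it comes from the analysis under discussion: I assume two groups of 5 releases each, normally distributed "bug scores", and a true difference of half a standard deviation between the groups. The helper names `permutation_p_value` and `estimated_power` are my own. The simulation repeats the experiment many times and counts how often a simple permutation test notices the difference that is really there.

```python
import random


def average(values):
    return sum(values) / len(values)


def permutation_p_value(a, b, n_perm=500, rng=None):
    """Two-sided permutation test for a difference in means."""
    rng = rng or random.Random(0)
    observed = abs(average(a) - average(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(average(pooled[:len(a)]) - average(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def estimated_power(n_per_group=5, true_shift=0.5, n_sim=200,
                    alpha=0.05, seed=1):
    """Fraction of simulated experiments that reject the null,
    even though the two groups really do differ by `true_shift`."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        # Hypothetical data: group b is shifted by a real effect.
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(true_shift, 1.0) for _ in range(n_per_group)]
        if permutation_p_value(a, b, rng=rng) < alpha:
            rejections += 1
    return rejections / n_sim


power = estimated_power()
print(f"share of experiments that detected the real difference: {power:.2f}")
```

Under these assumptions the test misses the real difference in most runs, which is exactly why "not significant" should be read as "we cannot tell" and not as "there is no difference".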
