logoalt Hacker News

PunchyHamsteryesterday at 8:18 PM2 repliesview on HN

Let's start with most outright alarming error - the claude statistics are taken out of whole 2 data points


Replies

logicprogyesterday at 8:20 PM

That's sort of the point. There isn't enough data to extrapolate, and yet that's exactly what those outraged about AI were doing, and when you do do the very minimal types of analyses (permutation tests, and looking at distributions, mostly) that are actually valid, safe, standard, and useful to do on such low amounts of date, again, no evidence for the outrage shows up, and the two releases look so normal that it sort of shows no one would've cared if they hadn't known or found out that Claude was involved.

I really think this a much better standard of evidence — limited though it is — to outrage-fueled cherry-picked anecdotes, which is what has been driving this whole thing. If you disagree, and think the outrage should go one when I've shown there's an absence of evidence entirely for it (although of course, that's not evidence of absence; maybe I'll have to eat my words 5 releases down the line, but appealing to that now feels like a Russell's Teapot), would you care to explain why?

show 1 reply
runarbergyesterday at 8:24 PM

The interpretations of the p-value is also alarming. One of the first thing they teach you in statistics class is: “an absence of evidence is not evidence of absence”.

This analysis showed that there is indeed an absence of evidence, but it concludes there is evidence of absence.

Traditional p-hacking is done by oversampling and overtesting. If you do 20 analysis on average one will show p < 0.05 by random chance. This analysis is doing the inverse of that. Under-sampling, and concluding with p > 0.05

show 2 replies