Hacker News

thaumasiotes, today at 1:05 AM (2 replies)

> I gave the feedback at one Google interview that they should send Google employees through to see how many get hired. Good to see they basically tried that.

They did, but not with the intention of doing anything about the problem.

This is a question of reliability, the conceptual 'correlation' of a measurement instrument with itself when measuring the same thing.

Reliability is one of two major concepts in psychometrics, the other being validity, the conceptual correlation between a measurement instrument and that part of reality that you're hoping to measure.

The question behind validity is "I want to know X; if I measure Y, how helpful will that be?" And the question behind reliability is "if I measure Z a second time, how closely will the two measurements agree?"

https://en.wikipedia.org/wiki/Reliability_(statistics)

https://en.wikipedia.org/wiki/Construct_validity

Yegge calls out both concepts explicitly, though not by name, in this essay:

>> The outcomes from interviewing are statistically terrible. Google did wave upon wave of analysis over the years, and all the results were incredibly depressing.

>> [reliability] To name just a few off the top of my head: interviewers barely agreed with each other. Put the same candidate in front of two of our sharpest people and you’d routinely get a confident “strong hire” from one and a flat “no” from the other.

>> [validity, though the 'problem' here is strongly confounded by a restriction of range issue] And once people were actually on the job, their interview scores told you next to nothing about how they’d do

>> [reliability] Hell, some of our star performers failed their Google interviews four or five times, finally got in after 2+ years...

>> [validity] ...and then outshone everyone else.
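To make the restriction-of-range confound concrete, here is a minimal simulation sketch. Every number in it is an assumption chosen for illustration (a 0.5 correlation between interview score and job performance across all candidates, and a policy of hiring the top 10% by score); none of it is Google data, and the function names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_range_restriction(n=20000, true_r=0.5, hire_fraction=0.1, seed=0):
    """Compare score/performance correlation for everyone vs. hires only.

    true_r and hire_fraction are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    scores, performance = [], []
    for _ in range(n):
        score = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        scores.append(score)
        # Performance is built to correlate with score at true_r
        # in the full candidate pool.
        performance.append(true_r * score + math.sqrt(1 - true_r ** 2) * noise)

    # Only candidates at or above the cutoff are "hired", so only their
    # performance would ever be observed in practice.
    cutoff = sorted(scores)[int(n * (1 - hire_fraction))]
    hired = [(s, p) for s, p in zip(scores, performance) if s >= cutoff]

    r_everyone = pearson(scores, performance)
    r_hired = pearson([s for s, _ in hired], [p for _, p in hired])
    return r_everyone, r_hired


if __name__ == "__main__":
    r_everyone, r_hired = simulate_range_restriction()
    print(f"all candidates: r = {r_everyone:.2f}")
    print(f"hired only:     r = {r_hired:.2f}")
```

Under these assumptions the correlation among hires comes out well below the correlation in the full pool, even though the score is moderately predictive by construction. That is why "scores told you next to nothing about the people we hired" is weaker evidence against validity than it sounds.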

The discussion of how interviewing outcomes are statistically terrible would benefit from naming the ways in which they're statistically terrible. Knowing the problem you have is an important step toward solving it.

(And as a side note, the last I heard from Google, you're not allowed to interview more often than once a year. Interviewing five times in two years would seem to violate that policy.)

It is a basic theorem that the validity of any instrument is bounded above by the square root of the reliability. It isn't possible for an unreliable instrument to be tightly correlated to reality, because it is, by definition, not tightly correlated with anything. That's what it means to be unreliable.
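A quick way to see the bound is the classical test theory model, where each observed score is a true score plus independent noise. The sketch below assumes that model and uses an arbitrary reliability of 0.25; the function names are hypothetical. It measures the same simulated people twice, then correlates one measurement with the true score, which is the best criterion any instrument could be validated against.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_attenuation(reliability, n=20000, seed=1):
    """Return (test-retest correlation, correlation with the true score).

    Assumes observed = true + independent Gaussian noise, with the true
    score having unit variance.
    """
    rng = random.Random(seed)
    # reliability = var(true) / var(observed) = 1 / (1 + noise_var)
    noise_sd = math.sqrt((1 - reliability) / reliability)
    truth, first, second = [], [], []
    for _ in range(n):
        t = rng.gauss(0, 1)
        truth.append(t)
        first.append(t + rng.gauss(0, noise_sd))   # first measurement
        second.append(t + rng.gauss(0, noise_sd))  # repeat measurement
    return pearson(first, second), pearson(first, truth)


if __name__ == "__main__":
    retest, validity = simulate_attenuation(reliability=0.25)
    print(f"test-retest reliability:       {retest:.2f}")
    print(f"sqrt(reliability):             {math.sqrt(retest):.2f}")
    print(f"correlation with true score:   {validity:.2f}")
```

In this model the correlation with the true score lands at about the square root of the test-retest correlation. Any real-world criterion is itself only imperfectly related to the true score, so its correlation with the instrument can only be lower than that ceiling.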

Thus, any company that wanted its hiring process to be good would necessarily be extremely concerned with making that process reliable: you need to come to the same decision when you assess the same person. This is something that interviews cannot achieve except at extreme cost. You'd need far more than five interviews to get a reliable assessment from them, despite the claim in this essay that "any more than four interviews and you're just playin' with your food". Of course, the Google interviews aren't supposed to be reliable anyway, so in that sense the claim is probably accurate.
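For a rough sense of scale on "far more than five", the Spearman-Brown formula predicts the reliability of an average of k parallel measurements from the reliability of one. The sketch below assumes interviews behave like parallel measurements with independent errors, and it plugs in a single-interview reliability of 0.2 purely as an illustrative guess; I don't have a real figure for that, and the function names are hypothetical.

```python
import math


def spearman_brown(single_reliability, k):
    """Predicted reliability of the average of k parallel measurements."""
    return k * single_reliability / (1 + (k - 1) * single_reliability)


def interviews_needed(single_reliability, target_reliability):
    """Smallest k whose predicted reliability reaches the target.

    Obtained by solving the Spearman-Brown formula for k. The small
    epsilon keeps floating-point error from bumping an exact integer
    answer up to the next integer.
    """
    k = (target_reliability * (1 - single_reliability)
         / (single_reliability * (1 - target_reliability)))
    return math.ceil(k - 1e-9)


if __name__ == "__main__":
    single = 0.2  # assumed reliability of one interview, not a measured value
    print(f"4 interviews: {spearman_brown(single, 4):.2f}")
    print(f"needed for 0.8: {interviews_needed(single, 0.8)}")
```

With that assumed 0.2, a four-interview loop reaches a predicted reliability of about 0.5, and getting to 0.8 takes about 16 interviews. A different starting reliability changes the numbers, but the shape of the curve is the point: each extra interview buys less than the one before it.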

The prescription Yegge offers is valid. Multi-month work assessments will give you a strong, reliable, and valid signal. They're also very expensive.

Another thing the essay completely glosses over is that this problem has been recognized for a long time, and we already know how to do assessments that are reliable, valid, and cheap to perform. They're called standardized tests.


Replies

Ferret7446, today at 1:39 AM

At least historically, Google prioritized not hiring bad candidates over hiring good candidates. So it was a priority neither for interviews to be consistent (for good candidates) nor for employees to be able to consistently pass interviews.

syndacks, today at 1:23 AM

Serious question: what do you think of using IQ tests to hire SWEs? Should we just do that instead?