Thousands of applicants reaching the substantial work stage is a failure of the systems thinking you're talking about. Hundreds of resumes nearly always get narrowed down to a dozen or two at most at the screening stage.
And I would make it very clear that putting in more than 30 minutes of work, timed, is a disqualifier, and I would sleep well at night clearing all those people out of the queue.
Hundreds of good applicants can’t be whittled down to a dozen without being very picky about things in the resume, which may just be a poor representation of the candidate.
You will bias heavily along some kind of axis: preferred previous employers, location, age, etc.
You add a lot of bias into the system by trying to further scrutinise people who are already meaningfully qualified on paper.