I mean have a bunch of competent teams that (importantly) didn’t design the algorithm read the final draft and write their own implementations of it. Then they and others can perform practical analysis on each implementation (empirically look for timing side channels on x86 and ARM, fuzz them, etc.).
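To make the “practical analysis” step concrete, here’s a minimal sketch of what a fuzz harness for one of those independent implementations could look like, using libFuzzer. `kem_decaps` and the size constants are hypothetical stand-ins for whatever API the implementation under test actually exposes; this isn’t any particular team’s code.

```c
// Minimal libFuzzer harness sketch for a KEM decapsulation routine.
// kem_decaps(), CIPHERTEXT_BYTES, and SECRET_KEY_BYTES are hypothetical
// names standing in for the implementation under test.
#include <stddef.h>
#include <stdint.h>

#define CIPHERTEXT_BYTES 1088    /* e.g., an ML-KEM-768-sized ciphertext */
#define SECRET_KEY_BYTES 2400    /* e.g., an ML-KEM-768-sized secret key */
#define SHARED_SECRET_BYTES 32

/* Hypothetical API of the implementation under test. */
int kem_decaps(uint8_t *shared_secret, const uint8_t *ciphertext,
               const uint8_t *secret_key);

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  if (size < CIPHERTEXT_BYTES + SECRET_KEY_BYTES)
    return 0;  /* not enough bytes to populate both inputs */

  uint8_t ss[SHARED_SECRET_BYTES];
  /* Feed attacker-controlled ciphertext (and a fuzzed key) into decaps;
     ASan/MSan will flag out-of-bounds accesses and uninitialized reads. */
  kem_decaps(ss, data, data + CIPHERTEXT_BYTES);
  return 0;
}
```

Build with `clang -fsanitize=fuzzer,address` against the library under test; the timing-side-channel side of the analysis would be a separate tool (dudect-style statistical tests, or valgrind/ctgrind-type taint checking), not this harness.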
> If instead you mean "figure out after some period of implementation whether the standard itself is good", I don't know how that's meant to be workable.
The forcing function can be: this final draft is the heir apparent, and if nothing serious comes up in the next 6 months, it will be summarily finalized.
It’s possible this won’t get any of the implementers off their ass within a reasonable timeframe; that happens with web standards all the time. It’s also possible it would uncover nothing that wasn’t already known. Like I said, I’m not totally convinced it makes sense in this specific field. But your arguments are fully general arguments against this kind of phased process at all, and I think it has empirically improved recent W3C and IETF standards (including QUIC and HTTP/2 and /3) a lot compared to the previous method.
Again: that has now happened. What have we learned from it that we needed to know 3 years ago when NIST chose Kyber? That's an important question, because this is a whole giant thread about Bernstein's allegation that the IETF is in the pocket of the NSA (see "part 4" of this series for that charming claim).
Further, the people involved in the NIST PQ key-establishment competition are a murderers’ row of serious cryptographers and cryptography engineers. All of them had the know-how and the incentive to write implementations of their own constructions and, if doing so would have exposed some glaring problem, of their competitors’ as well. What makes you think we lacked implementation understanding during this process?