Those second-level reviewers, who were checking whether the first-level authors used LLMs in their reviews, also used LLMs to do their screening, and those screening LLMs missed it in many cases.
My original point (loosely based on the subject, not TFA) is that it's LLMs all the way down, way more than it's "measured" to be.