I expect that, for values of n for which this test consistently reports "LLM-generated" on LLM-generated inputs, it will also consistently report "LLM-generated" on human-generated inputs. But I haven't run the test either, so I could be wrong.