That seems somewhat similar to perplexity-based detection, although you can just read off the probability of each token directly instead of picking from an n-best list, and you don't have to generate anything.
It kinda works, but is not very reliable and is quite sensitive to which model the text was generated with.
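For concreteness, a rough sketch of what I mean (using gpt2 via HuggingFace as a stand-in scoring model; real detectors are more involved, and which model you score with matters a lot):

    # Score an existing text's tokens under a causal LM (no generation),
    # then turn the mean log-prob into perplexity.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def perplexity(text: str) -> float:
        ids = tok(text, return_tensors="pt").input_ids
        with torch.no_grad():
            # labels=ids gives the mean cross-entropy of each token
            # given its prefix -- i.e. the per-token probabilities,
            # no sampling involved.
            loss = model(ids, labels=ids).loss
        return torch.exp(loss).item()

    print(perplexity("The quick brown fox jumps over the lazy dog."))

Unusually low perplexity is then taken as a weak signal that the text came from a similar model.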
This page has nice explanations:
https://www.pangram.com/blog/why-perplexity-and-burstiness-f...