The reasons hold in aggregate, but not necessarily for any specific PR.
You could get a perfectly adequate LLM-generated PR that is easy to read and verify, but generally they aren't.
A policy pushes the aggregate toward PRs that at least look and communicate like human-made ones, which are easier to approve in practice. Whether a given PR was created by an LLM is then secondary, and the policy likely pushes all PRs to be better.
Good point. Even if you submit small, verifiable and readable changes, you can still overload the review process by submitting too many of them (e.g., 100s of PRs).
But I'd argue that some projects [1] could benefit from the speed (and sometimes, quality) of AI code generation without filtering on something that's difficult to identify (i.e., whether it is truly human-generated).
One way could be to constrain the size of each commit and PR, and invest more heavily in the review process (e.g., tests, static/dynamic analysis, sandbox deployments), so even if you get 100s of contributions, you can knock each out quickly.
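To make the size constraint concrete, here's a rough sketch (not a definitive implementation) of a check a CI job could run. It assumes the diff summary comes from `git diff --numstat`; the limits and the names `check_pr_size` and `SizeReport` are made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical limits; a real project would tune these.
MAX_CHANGED_LINES = 200
MAX_CHANGED_FILES = 10


@dataclass
class SizeReport:
    files: int
    lines: int
    ok: bool


def check_pr_size(numstat: str,
                  max_lines: int = MAX_CHANGED_LINES,
                  max_files: int = MAX_CHANGED_FILES) -> SizeReport:
    """Parse `git diff --numstat` text and compare it against the limits."""
    files = 0
    lines = 0
    for row in numstat.splitlines():
        if not row.strip():
            continue
        # Each row is "<added>\t<deleted>\t<path>".
        added, deleted, _path = row.split("\t", 2)
        files += 1
        # Binary files are reported as "-"; count the file but no lines.
        if added != "-":
            lines += int(added) + int(deleted)
    return SizeReport(files, lines, lines <= max_lines and files <= max_files)


if __name__ == "__main__":
    sample = "12\t3\tsrc/parser.py\n-\t-\tdocs/logo.png\n"
    print(check_pr_size(sample))
```

In CI you'd feed it the output of something like `git diff --numstat main...HEAD` (assuming the base branch is `main`) and fail the job when `ok` is false. The thresholds are a judgment call per project.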
Obviously, easier said than done. And at that point, you may as well use the AI to make the commits yourself, instead of relying on community contributions.
[1] Of course, this is only the case if the project's only purpose is to be a tool, and not also a place for humans to learn how to code - in which case, it makes sense to invest more into identifying the "cheaters".