This reads as incredibly damning to me. PR throughput should be a metric that is very supportive of the AI productivity narrative, but the effect is marginal.
Before everyone comes at me: smoking cigarettes increases your risk of lung cancer by 15-30x. Effect size matters. So does margin of error: what is the margin of error here? This "increase" could easily be within noise.
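To make the noise point concrete, here is a toy sketch with invented numbers (none of these figures come from the study): a small lift in weekly PR throughput whose confidence interval straddles zero, using only a normal-approximation CI from the standard library.

```python
# Hypothetical illustration (all numbers invented): a "marginal" lift in
# weekly PR throughput can sit entirely inside the margin of error.
import math
import statistics

before = [12, 9, 14, 11, 10, 13, 8, 12]   # weekly PRs pre-rollout (made up)
after  = [13, 10, 15, 11, 12, 14, 9, 12]  # weekly PRs post-rollout (made up)

diff = statistics.mean(after) - statistics.mean(before)
# Standard error of the difference in means (independent samples).
se = math.sqrt(statistics.variance(before) / len(before)
               + statistics.variance(after) / len(after))
lo, hi = diff - 1.96 * se, diff + 1.96 * se  # ~95% CI, normal approximation

print(f"mean lift: {diff:.2f} PRs/week, 95% CI: ({lo:.2f}, {hi:.2f})")
# Here the interval straddles zero: the "increase" is indistinguishable
# from noise at this sample size.
```

With these made-up samples the point estimate is positive but the interval runs from roughly -1.1 to +2.9 PRs/week, which is exactly the "could be noise" situation.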
PR throughput is also not a metric I would ever use to judge developer productivity for a paradigm-shifting technology. I would only ever use it to compare like-to-like and find trailheads: is a team or person suddenly way more or less productive? The primary endpoint for software production is serving your customer or your mission, and PR throughput can't tell you whether any of that got better. Nor can it tell you the cost of your prior work: the increase in PR throughput could simply be more PRs fixing issues introduced by earlier LLM-assisted work.
I suspect the issue is the SDLC methodology of existing mature products. The "I can build it in a weekend" use case has gotten a massive boost, since you can build something that "looks" real faster than ever. Mature teams, by contrast, have to deal with backwards compatibility and real development risk.