Here are some benchmarks. TL;DR: Anubis is not as performant as an optimized client prover running on the same HEDT CPU.
So the "PoW tax" essentially applies only to low-volume requesters who have no incentive to optimize, or to bespoke solutions that are too diverse to optimize at scale.
https://yumechi.jp/en/blog/2025/proof-of-mutex-outspeeding-a...
https://github.com/eternal-flame-AD/pow-buster
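To make the asymmetry concrete, here is a minimal hashcash-style sketch in Python. It is my own illustration, not code from Anubis or pow-buster: the rule (SHA-256 digest must start with N zero hex digits) and the names `solve` and `verify` are assumptions chosen for the example.

```python
import hashlib
from itertools import count


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash checks the submitted nonce."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force the smallest nonce that passes verify().

    Expected work is roughly 16**difficulty hashes, so the cost is
    dominated by raw hash throughput.
    """
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce


if __name__ == "__main__":
    # Hypothetical challenge string and a deliberately low difficulty.
    nonce = solve("example-challenge", 2)
    print(nonce, verify("example-challenge", nonce, 2))
```

The loop is nothing but hashing, so a solver that uses SIMD and every core pays far less per solution than a naive one. That is the gap a high-volume requester has every reason to close.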
The problem was "fixed", but the fix was then reverted because it had a deadlock bug. (Changelog entry: "Remove bbolt actorify implementation due to causing production issues.")