A Raspberry Pi 4 can manage something like 70Mbps of raw AES en/decryption flow: https://github.com/lelegard/aesbench/blob/main/RESULTS.txt
That CPU is pretty much a toy compared to (say) a brand-new M5 or EPYC chip, but it similarly eclipses almost any MCU you can buy.
Even with fast AES acceleration on the CPU/MCU — which I think some Cortex MCUs have — you’re really going to struggles to get much over 100Mbits of encrypted traffic handling, and that’s before the I/O handling interrupts take over the whole chip to shuttle packets on and off the wire.
Modern crypto is cheap for what you get, but it’s still a lot of extra math in the mix when you’re trying to pump bytes in and out of a constrained device.
You're looking at the wrong thing, WireGuard doesn't use AES, it uses ChaCha20. AES is really, really painful to implement securely in software only, and the result performs poorly. But ChaCha only uses addition rotation and XOR with 32 bit numbers and that makes it pretty performant even on fairly computationally limited devices.
For reference, I have an implementation of ChaCha20 running on the RP2350 at 100MBit/s on a single core at 150Mhz (910/64 = ~14.22 cycles per bytes). That's a lot for a cheap microcontroller costing around 1.5 bucks total. And that's not even taking into account using the other core the RP2350 has, or overclocking (runs fine at 300Mhz also at double the speed).