So fast mode uses more tokens, in direct opposition to Gemini where fast 'mode' means less. One more piece of useless knowledge to remember.
You're comparing two different things. It's not useless knowledge, it's something you need to understand.
Opus fast mode is routed to different servers with different tuning that prioritizes individual response throughput. Same model served differently. Same response, just delivered faster.
The Gemini fast mode is a different model (most likely) with different levels of thinking applied. Very different response.
I don't think this is the case, according to the docs, right? The effort level will use fewer tokens, but the independent fast mode just somehow seems to use some higher priority infrastructure to serve your requests.