Hacker News

niobe yesterday at 9:23 PM

So fast mode uses more tokens, in direct opposition to Gemini, where fast 'mode' means fewer. One more piece of useless knowledge to remember.


Replies

Sol- yesterday at 9:41 PM

I don't think this is the case, according to the docs, right? The effort level will use fewer tokens, but the independent fast mode just seems to use some higher-priority infrastructure to serve your requests.

Aurornis yesterday at 10:25 PM

You're comparing two different things. It's not useless knowledge, it's something you need to understand.

Opus fast mode is routed to different servers with different tuning that prioritizes individual response throughput. Same model served differently. Same response, just delivered faster.

The Gemini fast mode is a different model (most likely) with different levels of thinking applied. Very different response.