logoalt Hacker News

aleccoyesterday at 9:30 PM1 replyview on HN

DeepSeek API gave 6x to 8x better caching rate for inputs over OpenRouter (even chosing DeepSeek as provider). And some of the cheaper providers are using FP4 quantizations.

https://openrouter.ai/deepseek/deepseek-v4-flash-20260423#pr...

After complaints the cached read is not listed anymore in that page, you have to click one by one. All providers for DeepSeek V4 Flash charge ~$0.02 while DeepSeek provider is $0.0028. For coding this is huge as caching often gets in the range of 90 to 99%. But OpenRouter messes your caching so don't use it. And it seems to be a VC-backed closed middle-man company, not open source or open anything.


Replies

ryeguyyesterday at 11:13 PM

Openrouter's pricing via the deepseek provider is the same as the official deepseek api for both flash and pro and for cached and uncached tokens. It's literally the same api.

And no, cache rates are not different if you're going through the official deepseek provider. The only way caching rates can drop is if you let openrouter fully control routing by preferring uptime or something, and then it might bounce you between providers. But you can control which providers for a given model are in its routing pool and stop that.