Do you mean Flash and not Pro? I haven't tried it personally, but according to OpenRouter, the fastest DeepSeek V4 Pro providers are only ~50 tps. That's slower than Claude Opus.
https://openrouter.ai/deepseek/deepseek-v4-pro?sort=throughp...
No, I mean Pro. I use it through OpenCode Go so I don't know what provider it uses under the hood, but it's very fast in my experience.
I don't think token speed matters as much when a task needs a lot of tokens to complete. E.g. in the Artificial Analysis benchmarks, DeepSeek V4 is one of the biggest token burners for getting through the benchmark.
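To make the arithmetic concrete, here's a rough sketch of wall-clock time as output tokens divided by throughput. The numbers are made up for illustration, not measurements of any model or provider, and `completion_time_s` is just a hypothetical helper name. It also ignores time-to-first-token and queueing.

```python
def completion_time_s(output_tokens: int, tokens_per_second: float) -> float:
    """Wall-clock seconds to generate output_tokens at a steady throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Hypothetical numbers, chosen only to illustrate the point.
# A model that streams twice as fast but needs three times the tokens:
fast_but_verbose = completion_time_s(output_tokens=30_000, tokens_per_second=100.0)
# A slower model that finishes the same task in fewer tokens:
slow_but_terse = completion_time_s(output_tokens=10_000, tokens_per_second=50.0)

print(f"fast but verbose: {fast_but_verbose:.0f} s")
print(f"slow but terse:   {slow_but_terse:.0f} s")
```

With these made-up inputs, the model with the higher tps still takes longer to finish the task, which is why per-task time (or tokens per task) is arguably the more useful number to compare.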
Yeah, Flash is crazy fast, but I've found its performance variable.
In recent benchmarking I've been doing, DeepSeek V4 Pro was the fastest of 21 models, by a comfortable margin (https://swelljoe.com/html/bench-report-final.html). It was faster than Claude Opus 4.8, which was the second fastest (Mistral doesn't count because it seems to have refused to participate). But it's a limited data set, just a few benchmark runs of a limited set of tasks. It's entirely possible I happened to be calling the API at its least busy time and Claude got hit during a busy one.