Don’t let the “flash” name fool you: this is an amazing model.
I have been playing with it for the past few weeks, and it’s genuinely my new favorite. It’s so fast and has such vast world knowledge that it’s more performant than Claude Opus 4.5 or GPT 5.2 extra high, for a fraction (basically an order of magnitude less!!) of the inference time and price.
I wonder at what point everyone who over-invested in OpenAI will regret their decision (except maybe Nvidia?). Maybe Microsoft doesn't need to care; they get to sell their models via Azure.
Alright, so we have more benchmarks, including hallucinations, and Flash doesn't do well on that one. Generally, though, it beats Gemini 3 Pro, GPT 5.1 Thinking, and GPT 5.2 Thinking xhigh (but then Sonnet, Grok, Opus, Gemini, and 5.1 all beat 5.2 xhigh) on everything else. Crazy.
Thanks, having it walk a hardcore SDR signal chain right now... oh damn, it just finished. The blog post makes it clear this isn't just some 'lite' model - you get low latency and cognitive performance. Really appreciate you amplifying that.
What are you using it for and what were you using before?
How good is it for coding, relative to recent frontier models like GPT 5.x, Sonnet 4.x, etc.?
Can you be more specific about the tasks you’ve found exceptional?
Gemini 2.0 Flash was already good for some of my tasks a long time ago.
Cool! I've been using 2.5 Flash and it is pretty bad; one out of five answers it gives will be a lie. Hopefully 3 is better.
I think Google is the only one still producing general-knowledge LLMs right now.
Claude has been a coding model from the start, and GPT is more and more becoming a coding model too.
How did you get early access?
Just to point this out: many of these frontier models cost not far off two orders of magnitude more than what DeepSeek charges. It doesn't compare the same, no, but with some coaxing I find it to be a pretty competent coding model, capable of answering a lot of general queries quite satisfactorily (though if it's a short session, why economize?). DeepSeek is $0.28/M tokens in, $0.42/M out; Opus 4.5 is $5/$25 (17x/60x).
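Those multipliers follow straight from the per-million-token prices quoted above; here's a quick sanity check in Python, using only the figures as listed:

    # Per-million-token prices quoted above (USD)
    deepseek_in, deepseek_out = 0.28, 0.42
    opus_in, opus_out = 5.00, 25.00

    # Opus 4.5 cost as a multiple of DeepSeek's
    print(f"input:  {opus_in / deepseek_in:.1f}x")    # 17.9x
    print(f"output: {opus_out / deepseek_out:.1f}x")  # 59.5x

Rounded, that's the 17x/60x above.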
I've been playing around with other models recently (Kimi, GPT Codex, Qwen, others) to try to better appreciate the difference. I knew there was a big price difference, but watching myself feed dollars into the machine rather than nickels has instilled quite the reverse appreciation in me too.
I can only assume "if you're not getting charged, you are the product" has to be somewhat in play here. But when working on open source code, I don't mind.
Oh wow - I recently tried 3 Pro preview and it was too slow for me.
After reading your comment I ran my product benchmark against 2.5 flash, 2.5 pro and 3.0 flash.
The results are better AND the response times have stayed the same. What an insane gain, especially considering the price compared to 2.5 Pro. I'm about to get much better results for a third of the price. Not sure what magic Google did here, but I would love to see a more technical deep dive comparing what they do differently in the Pro and Flash models to achieve this kind of performance.
Also wondering: how did you get early access? I use the Gemini API quite a lot and have quite a nice internal benchmark suite for it, so I'd love to toy with the new models as they come out.
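For anyone who wants to run a similar response-time comparison: below is a minimal sketch of such a harness, assuming the google-genai Python SDK. The model IDs and prompt are placeholders, so substitute whatever versions your account exposes and whatever test cases your own benchmark exercises.

    import time
    from google import genai  # assumes the google-genai SDK: pip install google-genai

    client = genai.Client()  # reads GEMINI_API_KEY from the environment

    # Placeholder model IDs; swap in the versions available to your account.
    MODELS = ["gemini-2.5-flash", "gemini-2.5-pro", "gemini-3-flash-preview"]
    PROMPT = "Summarize the tradeoffs between latency and reasoning depth."

    for model in MODELS:
        start = time.monotonic()
        response = client.models.generate_content(model=model, contents=PROMPT)
        elapsed = time.monotonic() - start
        print(f"{model}: {elapsed:.2f}s, {len(response.text)} chars")

Timing single calls like this is noisy, of course; averaging over a set of real prompts gives a fairer picture.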