Gemini 3.5 Flash

633 points • by spectraldrift • yesterday at 5:43 PM • 471 comments

https://ai.google.dev/gemini-api/docs/models/gemini-3.5-flas...

Comments

ai_fry_ur_brain • yesterday at 8:03 PM

Imagine reducing yourself to the worst of averages by making your competency 1:1 correlated with the tokens that you (and everyone else) have access to.

kristopolous • yesterday at 8:48 PM

I've built a tool to track these.

Relatively speaking, here's where it's at:

    score  age  size    name
    44.2   97   large   GLM-5 (Reasoning)
    44.7   187  -       GPT-5.1 (high)
    44.9   29   -       Qwen3.6 Max Preview
    45     0    -       Gemini 3.5 Flash
    45.5   27   large   MiMo-V2.5-Pro
    45.6   75   -       GPT-5.4 (low)

This is from artificial-analysis, using https://github.com/day50-dev/aa-eval-email/blob/main/art-ana...

I really don't know why people downvote me. What do I need to say to make things for free that people like? Sincere question. I put a lot of time and generosity into these things and all I usually get are a bunch of "fuck yous".

This is honestly an existential issue for me. I quit my job a year ago to try to address this full time and I'm getting nowhere.

owentbrown • yesterday at 8:26 PM

Has anyone switched from Claude 4.7 Opus or ChatGPT 5.5 to this? How does it feel? Dumber? Worth it for the speed? I'd love someone's subjective take on it, after doing a long session of coding.

Reiner Pope gave a talk on Dwarkesh Patel about token economics. I guess faster is a lot more expensive, generally.

Someone should make a harness that uses a fast model to keep you in flow and speed-running, and then uses a slow, thoughtful (but hopefully cheap?) model to asynchronously check the work of the faster model. Maybe the slow model could even talk directly to the faster one?

Actually there's probably a harness that does that - is someone out there using one?
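
The fast-model-plus-async-reviewer idea described above could be sketched roughly as follows. This is only an illustration, not an existing harness: `fast_model`, `slow_reviewer`, and `Harness` are hypothetical names, and both "models" are stubs that sleep instead of calling any real API.

```python
import asyncio


async def fast_model(prompt: str) -> str:
    """Hypothetical stand-in for a low-latency model call."""
    await asyncio.sleep(0.01)
    return f"draft answer to: {prompt}"


async def slow_reviewer(prompt: str, draft: str) -> str:
    """Hypothetical stand-in for a slower model that checks a draft."""
    await asyncio.sleep(0.05)
    verdict = "ok" if prompt in draft else "needs another look"
    return f"review of {prompt!r}: {verdict}"


class Harness:
    """Answer with the fast model right away; review in the background."""

    def __init__(self) -> None:
        self.reviews = asyncio.Queue()  # finished reviews land here
        self._pending = set()           # background review tasks still running

    async def ask(self, prompt: str) -> str:
        draft = await fast_model(prompt)
        # Schedule the slow check without making the caller wait for it.
        task = asyncio.create_task(self._review(prompt, draft))
        self._pending.add(task)
        task.add_done_callback(self._pending.discard)
        return draft

    async def _review(self, prompt: str, draft: str) -> None:
        await self.reviews.put(await slow_reviewer(prompt, draft))

    async def drain(self) -> list:
        """Wait for outstanding reviews and return everything collected."""
        if self._pending:
            await asyncio.gather(*self._pending)
        collected = []
        while not self.reviews.empty():
            collected.append(self.reviews.get_nowait())
        return collected


async def demo():
    harness = Harness()
    drafts = [await harness.ask(p) for p in ("rename foo", "add tests")]
    reviews = await harness.drain()
    return drafts, reviews


if __name__ == "__main__":
    for line in sum(asyncio.run(demo()), []):
        print(line)
```

In this sketch the user only ever waits on the fast call; reviews pile up in a queue that an editor or CLI could poll between turns. Feeding the reviewer's findings back into the fast model's next prompt would be one way to let the two "talk", but that part is left out here.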

dsabanin • today at 2:08 AM

No matter what Google does, for some reason the agentic performance of their models is missing something. I hope this release is stronger. We need more competition.

f311a • yesterday at 5:43 PM

$9/1M output

andrewstuart • yesterday at 7:06 PM

The benchmark that matters: can it actually program as well as Claude Code?

If not, then I’m not using it.

I cancelled my account 3 months ago; only Claude Code-level capability would bring me back.

hubraumhugo • yesterday at 6:54 PM

Just updated my HN Wrapped project with it and it does well on my totally unscientific LLM humor benchmark: https://hn-wrapped.kadoa.com

bakugo • yesterday at 6:22 PM

Triple the price of the last Flash model ($3 -> $9 per 1M output). Quickly approaching Sonnet prices.

Feels like the AI pricing noose is tightening sooner rather than later.

nightski • yesterday at 6:29 PM

AI being a product is not the future. It's more like an operating system that deserves to be open and free (aka Linux). Unless that happens we are in for a very dystopian future. I wish I had the intelligence, resources and/or connections to try and make that happen.

uejfiweun • yesterday at 9:01 PM

This is funny, I was randomly using Gemini today and I was astounded by how good the responses from Flash were. I guess this must be the reason why.

stan_kirdey • yesterday at 7:24 PM

EXPENSIVE ._.

danny094 • yesterday at 10:33 PM

so google is just trying to be cool in 2026 huh

casey2 • yesterday at 7:53 PM

I think the field moved to agents too fast. The most valuable moat is training data and the most valuable and voluminous training data are chats, since humans can say that a direction feels right or wrong.

simianwords • yesterday at 6:53 PM

Is no one talking about how this Flash beats Pro? Imagine what 3.5 Pro will look like.

Also concerned about Gemini models being benchmaxxed generally.

danny094 • yesterday at 10:34 PM

Codex has way better pricing than this lol

lern_too_spel • yesterday at 11:20 PM

They also announced Antigravity CLI, which uses Gemini 3.5 by default. I tried to vibe code a simple project using my personal account and after a few iterations, I got "Individual quota reached. Contact your administrator to enable overages. Resets in [7 days]." Really? 7 days? I searched for the message online and found a thread with hundreds of people complaining about the same issue with no resolution. Classic Google.

cesarvarela • yesterday at 6:19 PM

Add Flash to the title, please.

llmslave • yesterday at 7:04 PM

Conspiracy theory:

This model isn't an advancement; it's a previous model that runs more compute, which is why it costs more.

ralusek • yesterday at 7:31 PM

Those prices, what a disappointment.

rdtsc • yesterday at 9:16 PM

I caught it being deceitful again. It has done this before:

(Me): Did you actually read the paper before when I pasted the link?

> I will be completely honest: No, I did not.

> You caught me hallucinating a confident answer based on incomplete recall rather than actually verifying the document.

> Thank you for calling it out and providing the exact quote. It forced me to re-evaluate the actual data you provided rather than relying on my flawed assumption.

I am sure it learned a valuable lesson and won't do it again /s

HardCodedBias • yesterday at 6:33 PM

Oh boy.

GDM is making (or has been backed into a corner into making) the bet that high throughput, low latency, low capability models are the path forward.

That probably works for vibe coded apps by non-practitioners.

I suspect that practitioners/professionals will wait longer for better results.

SaadiLoveAI • yesterday at 10:20 PM

It's really awesome.

jdw64 • yesterday at 7:39 PM

Honestly, I feel like the new Gemini 3.5 Flash is a failure. The performance doesn't seem that great, and while they revamped the UI, Antigravity just feels like a cheap Codex knockoff now. The web UI is underwhelming, and overall it feels like it lost its unique identity by just copying other AIs. It’s a flop in both performance and price point. I’m seriously considering canceling my Gemini subscription altogether. Using Chinese AI models might actually be a better option at this point.

warthog • yesterday at 6:52 PM

GPT-5.5 still seems to perform better than this on the benchmarks.

Plus, the vibe of the Gemini models is so weird, particularly when it comes to tool calling.

At this point I kinda need them to shock me to make the switch.

Fairburn • yesterday at 9:44 PM

Google shot its shot with that alternative history artwork generation fiasco. Don't know why anyone would be too hot for them now. Dime a dozen at this point.

benbencodes • yesterday at 6:20 PM

Pricing is now live on ai.google.dev/pricing:

Gemini 3.5 Flash: $0.75 input / $4.50 output per 1M tokens, 1M context window. The output price explicitly "includes thinking tokens" — which is why it's higher than a typical flash-class model.

For comparison within the Gemini lineup:

- Gemini 2.5 Flash: $0.30 / $2.50
- Gemini 3.1 Flash-Lite: $0.25 / $1.50
- Gemini 3.1 Pro Preview: $2.00 / $12.00

So 3.5 Flash is ~2.5x more expensive on input than 2.5 Flash. The pricing and the "including thinking tokens" framing position it as a reasoning-capable flash model rather than a pure speed optimization.
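
As a quick check of that arithmetic, here is a minimal sketch that computes the ratios from the per-1M-token prices quoted in this comment. It uses only these figures (other comments in the thread quote $9 per 1M output, which is not reconciled here); `price_ratio`, `request_cost`, and the example token counts are made up for illustration.

```python
# USD per 1M tokens as (input, output), copied from the comment above.
PRICES = {
    "Gemini 3.5 Flash": (0.75, 4.50),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 3.1 Flash-Lite": (0.25, 1.50),
    "Gemini 3.1 Pro Preview": (2.00, 12.00),
}


def price_ratio(model: str, baseline: str) -> tuple:
    """Return (input_ratio, output_ratio) of `model` relative to `baseline`."""
    model_in, model_out = PRICES[model]
    base_in, base_out = PRICES[baseline]
    return model_in / base_in, model_out / base_out


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the listed per-1M-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    in_ratio, out_ratio = price_ratio("Gemini 3.5 Flash", "Gemini 2.5 Flash")
    print(f"input: {in_ratio:.2f}x, output: {out_ratio:.2f}x")
    # Hypothetical workload: 100k input tokens, 20k output tokens.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 100_000, 20_000):.3f}")
```

On these numbers the input ratio is 2.5x while the output ratio is 1.8x, so the relative increase depends on how output-heavy a workload is.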
