We've gone from yearly releases to quarterly releases.
If the pace of releases continues to accelerate, we're headed for weekly releases by mid-2027 or 2028.
The CLI needs work, or they should officially allow third-party harnesses. Right now, the CLI experience is noticeably behind other SOTA models. It actually works much better when paired with Opencode.
But with accounts reportedly being banned over ToS issues, similar to Claude Code, it feels risky to rely on it in a serious workflow.
Does anyone know if this is in GA immediately or if it is in preview?
On our end, Gemini 3.0 Preview was very flaky (not model quality, but the API responses sometimes errored out), making it unreliable.
Does this mean that 3.0 is now GA at least?
> Last week, we released a major update to Gemini 3 Deep Think to solve modern challenges across science, research and engineering. Today, we’re releasing the upgraded core intelligence that makes those breakthroughs possible: Gemini 3.1 Pro.
So this is the same but not the same as Gemini 3 Deep Think? Keeping track of these different releases is getting pretty ridiculous.
Fine, I guess. The only commercial API I use to any great extent is gemini-3-flash-preview: cheap, fast, great for tool use and with agentic libraries. The 3.1-pro-preview is great, I suppose, for people who need it.
Off topic, but I like to run small models on my own hardware, and some small models are now very good for tool use and with agentic libraries - it just takes a little more work to get good results.
Google really seems to be pulling ahead in this AI race. For me personally they offer the best deal, although the software is not quite there compared to OpenAI or Anthropic (in terms of 1. the web GUI, 2. the agent CLI). I hope they can fix that in the future, and I think once Gemini 4 or whatever launches we will see another huge leap.
The eventual nerfing gives me pause. Flash is awesome. What we really want is gemini-3.1-flash :)
Great model until it gets nerfed. I wish they had a higher paid tier for a non-nerfed model.
There's a very short blog post up: https://blog.google/innovation-and-ai/models-and-research/ge...
I'm keen to know how and where you are using Gemini.
Anthropic is clearly targeting developers, and OpenAI is the general go-to AI model. Who is the target demographic for the Gemini models? I know they are good and Flash is super impressive, but I'm curious.
Why should I be excited?
Another preview release. Does that mean Google's recommended models for production are still 2.5 Flash and Pro? Not talking about what people are actually doing, just Google's official recommendation. Kind of crazy if that's the case.
Gemini 3.0 Pro is a bad model for its class. I really hope 3.1 is a leap forward.
I use Gemini Flash Lite in a side project, and it's stuck on 2.5. A newer version now seems well overdue. Any speculation as to what's going on?
My first impression is that the model sounds slightly more human and a little more praising. Still comparing the ability.
It's been hugged to death. I keep getting "Something went wrong".
Somehow doesn't work for me :) "An internal error has occurred"
Humanity's Last Exam 44%, SciCode 59, this one 80, that one 78, but never 100%.
It would be nice to see one of these models, Plus, Pro, Super, God mode, hit 100% on even one benchmark. Or am I missing something here?
The biggest increase is LiveCodeBench Pro: 2887. The rest are in line with Opus 4.6 or slightly better or slightly worse.
OK, so they're scared that 5.3 (Pro) will be released today or tomorrow and blow this out of the water, so they rushed it out while they could still reference 5.2 benchmarks.
It appears the only difference from 3.0 Pro Preview is Medium reasoning. Model naming long ago gave up on making sense, but considering 3.0 is still in preview itself, bumping the version number for such a minor change is not a move in the right direction.
biggest problem is that it's slow. also safety seems overtuned at the moment. getting some really silly refusals. everything else is pretty good.
I hope to have great next two weeks before it gets nerfed.
Google is terrible at marketing, but this feels like a big step forward.
As per the announcement, Gemini 3.1 Pro score 68.5% on Terminal-Bench 2.0, which makes it the top performer on the Terminus 2 harness [1]. That harness is a "neutral agent scaffold," built by researchers at Terminal-Bench to compare different LLMs in the same standardized setup (same tools, prompts, etc.).
It's also taken top model place on both the Intelligence Index & Coding Index of Artificial Analysis [2], but on their Agentic Index, it's still lagging behind Opus 4.6, GLM-5, Sonnet 4.6, and GPT-5.2.
---
[1] https://www.tbench.ai/leaderboard/terminal-bench/2.0?agents=...
Just wish I could get the 2.5 daily limit above 1000 requests easily. Driving me insane...
More discussion: https://news.ycombinator.com/item?id=47075318
Please, I need 3 in GA…
Ok, why don't you work on getting 3.0 out of preview first? 10 min response time is pretty heinous
Relatedly, Gemini chat seems to be if not down then extremely slow.
ETA: They apparently wiped out everyone's chats (including mine). "Our engineering team has identified a background process that was causing the missing user conversation metadata and has successfully stopped the process to prevent further impact." El Mao.
To use in OpenCode, you can update the models it has:
opencode models --refresh
Then /models and choose Gemini 3.1 Pro.

You can use the model through OpenCode Zen right away and avoid that Google UI craziness.
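Putting those steps together (just a sketch based on the commands above; assumes OpenCode is already installed and authenticated, and that /models is typed inside an interactive OpenCode session):

    # refresh OpenCode's model list so the new release shows up
    opencode models --refresh
    # then, inside an OpenCode session, open the model picker and select Gemini 3.1 Pro
    /models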
---
It is quite pricey! Good speed and nailed all my tasks so far. For example:
@app-api/app/controllers/api/availability_controller.rb
@.claude/skills/healthie/SKILL.md
Find Alex's id, and add him to the block list, leave a comment
that he has churned and left the company. we can't disable him
properly on the Healthie EMR for now so
this dumb block will be added as a quick fix.
Result was: 29,392 tokens
$0.27 spent
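For a rough sense of what those numbers imply per token (a back-of-the-envelope sketch, not official pricing; the input/output split isn't given, so this is only a blended figure):

    # blended rate from the numbers above: 0.27 / 29,392 tokens, scaled to 1M tokens
    awk 'BEGIN { printf "~$%.2f per 1M tokens\n", 0.27 / 29392 * 1000000 }'
    # prints ~$9.19 per 1M tokens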
So relatively small task, hitting an API, using one of my skills, but a quarter. Pricey!

Doesn't show as available in the Gemini CLI for me. I have one of those "AI Pro" packages, but don't see it. Typical for Google, completely unclear how to actually use their stuff.
I hereby allow you to release models not at the same time as your competitors.
The visual capabilities of this model are frankly kind of ridiculous, what the hell.
I know Google has Antigravity, but do they have anything like Claude Code in terms of user interface, i.e. basically a terminal TUI?
Whoa, I think Gemini 3 Pro was a disappointment, but Gemini 3.1 Pro is definitely the future!
I always try Gemini models when they get updated with their flashy new benchmark scores, but always end up using Claude and Codex again...
I get the impression that Google is focusing on benchmarks but without assessing whether the models are actually improving in practical use-cases.
I.e. they are benchmaxing
Gemini is "in theory" smart, but in practice is much, much worse than Claude and Codex.
Can we switch from Claude Code to Google yet?
Benchmarks are saying: just try
But the real world could be different
Does it still crash out after a couple of prompts?
Another preview model? Why does Google keep doing this?
Someone needs to make an actually good benchmark for LLMs that matches real-world expectations; there's more to benchmarks than accuracy against a dataset.