Hacker News

Gemini 3.1 Pro

309 points by MallocVoidstar today at 3:19 PM | 562 comments | view on HN

Preview: https://console.cloud.google.com/vertex-ai/publishers/google...

Card: https://deepmind.google/models/model-cards/gemini-3-1-pro/


Comments

jeffybefffy519 today at 7:40 PM

Someone needs to make an actually good benchmark for LLMs that matches real-world expectations; there's more to benchmarks than accuracy against a dataset.

show 2 replies
onlyrealcuzzo today at 5:02 PM

We've gone from yearly releases to quarterly releases.

If the pace of releases continues to accelerate, we're headed for weekly releases by mid-2027 or 2028.

show 1 reply
azuanrb today at 5:22 PM

The CLI needs work, or they should officially allow third-party harnesses. Right now, the CLI experience is noticeably behind other SOTA models. It actually works much better when paired with Opencode.

But with accounts reportedly being banned over ToS issues, similar to Claude Code, it feels risky to rely on it in a serious workflow.

syspec today at 6:29 PM

Does anyone know if this is in GA immediately or if it is still in preview?

On our end, Gemini 3.0 Preview was very flaky (not model quality, but the API responses sometimes errored out), making it unreliable.

Does this mean that 3.0 is now GA at least?

zokier today at 4:57 PM

> Last week, we released a major update to Gemini 3 Deep Think to solve modern challenges across science, research and engineering. Today, we’re releasing the upgraded core intelligence that makes those breakthroughs possible: Gemini 3.1 Pro.

So this is same but not same as Gemini 3 Deep Think? Keeping track of these different releases is getting pretty ridiculous.

show 2 replies
mark_l_watson today at 4:17 PM

Fine, I guess. The only commercial API I use to any great extent is gemini-3-flash-preview: cheap, fast, great for tool use and with agentic libraries. The 3.1-pro-preview is great, I suppose, for people who need it.

Off topic, but I like to run small models on my own hardware, and some small models are now very good for tool use and with agentic libraries - it just takes a little more work to get good results.
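The core of those agentic libraries is just a dispatch loop around the model's tool calls. A minimal sketch, with the caveat that the tool name, message shapes, and the simulated model turns here are all made up for illustration; a real setup would take the turns from a local model's structured output:

```python
# Minimal agentic tool-use loop, independent of any particular model.
# The "model turns" below are hard-coded stand-ins for what a small
# local model would emit as structured output.

def get_weather(city: str) -> str:
    """A toy tool the model is allowed to call."""
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def run_agent(model_turns):
    """Dispatch each tool call to the matching Python function,
    collect the results, and stop at the model's final answer."""
    transcript = []
    for turn in model_turns:
        if turn["type"] == "tool_call":
            result = TOOLS[turn["name"]](**turn["args"])
            transcript.append({"role": "tool", "content": result})
        else:  # final answer from the model
            transcript.append({"role": "assistant", "content": turn["content"]})
    return transcript

# Simulated model output: one tool call, then an answer.
turns = [
    {"type": "tool_call", "name": "get_weather", "args": {"city": "Oslo"}},
    {"type": "answer", "content": "It's sunny in Oslo."},
]
print(run_agent(turns))
```

The "little more work" with small models is mostly in getting them to emit that structured output reliably; the loop itself doesn't change.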

show 3 replies
mixel today at 4:53 PM

Google really seems to be pulling ahead in this AI race. For me personally they offer the best deal, although the software is not quite there compared to OpenAI or Anthropic (in regard to 1. the web GUI, 2. the agent CLI). I hope they can fix that in the future, and I think once Gemini 4 or whatever launches we will see a huge leap again.

show 2 replies
hsaliak today at 4:49 PM

The eventual nerfing gives me pause. Flash is awesome. What we really want is gemini-3.1-flash :)

makeavish today at 4:42 PM

Great model until it gets nerfed. I wish they had a higher paid tier for the non-nerfed model.

show 3 replies
quacky_batak today at 4:44 PM

I'm keen to know how and where you are using Gemini.

Anthropic is clearly targeted at developers, and OpenAI is the general go-to AI model. Who is the target demographic for Gemini models? I know that they are good, and Flash is super impressive, but I'm curious.

show 13 replies
jdthedisciple today at 9:04 PM

Why should I be excited?

denysvitali today at 4:33 PM

Where is Simon's pelican?

show 3 replies
__jl__ today at 4:32 PM

Another preview release. Does that mean the models Google recommends for production are still 2.5 Flash and Pro? Not talking about what people are actually doing, but the official Google recommendation. Kind of crazy if that is the case.

yuvalmer today at 6:44 PM

Gemini 3.0 Pro is a bad model for its class. I really hope 3.1 is a leap forward.

seizethecheese today at 5:08 PM

I use Gemini flash lite in a side project, and it’s stuck on 2.5. It’s now well behind schedule. Any speculation as to what’s going on?

show 1 reply
kuprel today at 7:43 PM

Why don't they show Grok benchmarks?

show 1 reply
eric15342335 today at 4:47 PM

My first impression is that the model sounds slightly more human and a little more flattering. Still comparing the capabilities.

1024core today at 5:29 PM

It's been hugged to death. I keep getting "Something went wrong".

matrix2596 today at 4:29 PM

Gemini 3.1 Pro is based on Gemini 3 Pro

show 1 reply
msavara today at 4:33 PM

Somehow doesn't work for me :) "An internal error has occurred"

trilogic today at 6:10 PM

Humanity's Last Exam 44%, SciCode 59, that one 80, this one 78, but never 100%.

It would be nice to see one of these models (Plus, Pro, Super, God mode) hit 100% on even one benchmark. Am I missing something here?

PunchTornado today at 4:23 PM

The biggest increase is LiveCodeBench Pro: 2887. The rest are in line with Opus 4.6 or slightly better or slightly worse.

show 1 reply
naiv today at 4:43 PM

Ok, so they are scared that 5.3 (Pro) will be released today/tomorrow and blow it out of the water, and rushed this out while they could still reference 5.2 benchmarks.

show 1 reply
Topfi today at 4:12 PM

Appears the only difference from 3.0 Pro Preview is Medium reasoning. Model naming long ago stopped even trying to make sense, but considering 3.0 is still in preview itself, bumping the version number for such a minor change is not a move in the right direction.

show 4 replies
LZ_Khan today at 4:57 PM

Biggest problem is that it's slow. Also, safety seems overtuned at the moment; I'm getting some really silly refusals. Everything else is pretty good.

makeavish today at 4:39 PM

I hope to have great next two weeks before it gets nerfed.

show 1 reply
mustaphah today at 4:42 PM

Google is terrible at marketing, but this feels like a big step forward.

As per the announcement, Gemini 3.1 Pro scores 68.5% on Terminal-Bench 2.0, which makes it the top performer on the Terminus 2 harness [1]. That harness is a "neutral agent scaffold" built by the Terminal-Bench researchers to compare different LLMs in the same standardized setup (same tools, prompts, etc.).

It's also taken the top spot on both the Intelligence Index and the Coding Index at Artificial Analysis [2], but on their Agentic Index it's still lagging behind Opus 4.6, GLM-5, Sonnet 4.6, and GPT-5.2.

---

[1] https://www.tbench.ai/leaderboard/terminal-bench/2.0?agents=...

[2] https://artificialanalysis.ai

show 1 reply
BMFXX today at 7:04 PM

Just wish I could get the 2.5 daily limit above 1000 requests easily. Driving me insane...

lysecret today at 6:37 PM

Please, I need 3 in GA…

nautilus12 today at 5:11 PM

Ok, why don't you work on getting 3.0 out of preview first? A 10-minute response time is pretty heinous.

show 1 reply
jeffbee today at 4:58 PM

Relatedly, Gemini chat seems to be if not down then extremely slow.

ETA: They apparently wiped out everyone's chats (including mine). "Our engineering team has identified a background process that was causing the missing user conversation metadata and has successfully stopped the process to prevent further impact." El Mao.

sergiotapia today at 4:57 PM

To use in OpenCode, you can update the models it has:

    opencode models --refresh
Then run /models and choose Gemini 3.1 Pro.

You can use the model through OpenCode Zen right away and avoid that Google UI craziness.

---

It is quite pricey! Good speed and nailed all my tasks so far. For example:

    @app-api/app/controllers/api/availability_controller.rb 
    @.claude/skills/healthie/SKILL.md 

    Find Alex's id, and add him to the block list, leave a comment 
    that he has churned and left the company. we can't disable him 
    properly on the Healthie EMR for now so 
    this dumb block will be added as a quick fix.
Result was:

    29,392 tokens
    $0.27 spent
So a relatively small task (hitting an API, using one of my skills), but it cost a quarter. Pricey!
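Back-of-the-envelope from those two numbers, as a blended rate only; this is not Google's official pricing, which splits input and output tokens at different rates:

```python
# Blended per-million-token rate implied by the run above:
# 29,392 tokens for $0.27 (input and output lumped together).
tokens = 29_392
cost = 0.27
per_million = cost / tokens * 1_000_000
print(f"${per_million:.2f} per 1M tokens blended")  # roughly $9.19
```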
show 1 reply
cmrdporcupine today at 4:55 PM

Doesn't show as available in gemini CLI for me. I have one of those "AI Pro" packages, but don't see it. Typical for Google, completely unclear how to actually use their stuff.

dude250711 today at 4:28 PM

I hereby allow you to release models not at the same time as your competitors.

show 1 reply
himata4113 today at 6:04 PM

The visual capabilities of this model are frankly kind of ridiculous, what the hell.

johnwheeler today at 5:35 PM

I know Google has Antigravity, but do they have anything like Claude Code in terms of user interface, i.e. a terminal-based TUI?

show 1 reply
leecommamichael today at 6:50 PM

Whoa, I think Gemini 3 Pro was a disappointment, but Gemini 3.1 Pro is definitely the future!

saberience today at 4:34 PM

I always try Gemini models when they get updated with their flashy new benchmark scores, but always end up using Claude and Codex again...

I get the impression that Google is focusing on benchmarks but without assessing whether the models are actually improving in practical use-cases.

I.e. they are benchmaxing

Gemini is "in theory" smart, but in practice is much, much worse than Claude and Codex.

show 5 replies
throwaw12 today at 5:41 PM

Can we switch from Claude Code to Google yet?

Benchmarks are saying: just try

But real world could be different

show 1 reply
pickle-pixel today at 7:19 PM

Does it still crash out after a couple of prompts?

taytus today at 8:41 PM

Another preview model? Why does Google keep doing this?


Filip_portive today at 7:22 PM

My new comment
