Slight increase in model cost, but looks like benefits across the board to match.
gpt-5.2: $1.75 input / $0.175 cached input / $14.00 output (per 1M tokens)
gpt-5.1: $1.25 input / $0.125 cached input / $10.00 output (per 1M tokens)

Plus users are now defaulted to a faster, shallower GPT-5.2 Thinking mode called "Standard", and you now have to manually select "Extended" to get back to the previous depth of thinking. Yet the 3K-messages-a-week quota is the same regardless of thinking level. Also, the selection does not sync to mobile (you know, just not enough RAM in computers these days to persist a setting between web and mobile).
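Quick back-of-the-envelope if you want to compare the two at those rates (the prices are per 1M tokens; the token counts in the example are made up, not real usage data):

    # rough per-request cost at the listed per-1M-token prices
    PRICES = {
        "gpt-5.2": {"input": 1.75, "cached": 0.175, "output": 14.00},
        "gpt-5.1": {"input": 1.25, "cached": 0.125, "output": 10.00},
    }

    def cost(model, input_toks, cached_toks, output_toks):
        p = PRICES[model]
        return (input_toks * p["input"]
                + cached_toks * p["cached"]
                + output_toks * p["output"]) / 1_000_000

    # e.g. 20K fresh input, 80K cached input, 5K output per request
    for m in PRICES:
        print(m, round(cost(m, 20_000, 80_000, 5_000), 4))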
So, right off the bat: 5.2 code talk (through codex) feels really nice. The first coding attempt was a little meh compared to 5.1 codex max (reflecting what they wrote themselves), but simply planning / discussing things felt markedly better than anything I remember from any previous model, from any company.
I remain excited about new models. It's like finding that my coworker got 10% smarter every other week.
> it’s better at creating spreadsheets
I have a bad feeling about this.
> Additionally, on our internal benchmark of junior investment banking analyst spreadsheet modeling tasks—such as putting together a three-statement model for a Fortune 500 company with proper formatting and citations, or building a leveraged buyout model for a take-private—GPT 5.2 Thinking's average score per task is 9.3% higher than GPT‑5.1’s, rising from 59.1% to 68.4%.
Confirming prior reporting about them hiring junior analysts
The ARC-AGI-2 bump to 52.9% is huge. Shockingly, GPT-5.2 Pro does not add much more (54.2%) for the increased cost.
Incidentally, this is also the 10th anniversary, to the day, of OpenAI's founding.
Does it still use the word ‘fluff’ in 90% of its preambles, or is it finally able to get straight to the point?
Is the training cutoff date known?
Discussion on blog post: https://openai.com/index/introducing-gpt-5-2/ (https://news.ycombinator.com/item?id=46234874)
They’re definitely just training the models on the benchmarks at this point
are we doomed yet?
Seems not yet with 5.2
So how much better is it than Opus or Gemini?
Marginal gains for an exorbitantly pricey, closed model…
OpenAI is really good at just saying stuff on the internet.
I love the way they talk about incorrect responses:
> Errors were detected by other models, which may make errors themselves. Claim-level error rates are far lower than response-level error rates, as most responses contain many claims.
“These numbers might be wrong because they were made up by other models, which we will not elaborate on; also these numbers are much higher by a metric that reflects how people use the product, which we will not be sharing.”
I also really love the graph where they drew a line at “wrong half of the time” and labeled it ‘Expert-Level’.
10/10, reading this post is experientially identical to watching that 12 hours of jingling keys video, which is hard to pull off for a blog.
Did Calmmy Sammy say that this is the version that will finally cure cancer? The shakeout in the AI industry is going to be brutal. Can't see how Private Equity is going to get the little guy left holding the giant bag of excrement, but they will figure that out. AI: smart enough to replace you, but not quite smart enough to replace the CEO or Hedge Fund Bros.
Still 256K input tokens. So disappointing (predictable, but disappointing).
I'm happy for this, but with all these math and science benchmarks, has anyone ever made a communicates-like-a-human benchmark? Or an isn't-frustrating-to-talk-with benchmark?
I have already cancelled. Claude is more than enough for me. I don’t see any point in splitting hairs. They are all going to keep lying more and more sneakily.
“…where it outperforms industry professionals at well-specified knowledge work tasks spanning 44 occupations.”
What a sociopathic way to sell
"Investors are putting pressure, change the version number now!!!"
Is this another GPT-4.5?
I'm not interested in using OpenAI anymore because Sam Altman is so untrustworthy. All you see on X.com is him and Greg Brockman kissing David Sacks' ass, trying to make inroads with him, asking Disney for investments, and shit. Are you kidding? Who wants to support these clowns? Let's let Google win. Let's let Anthropic win. Anyone but Sam Altman.
They just keep flogging that dead horse.
The winner in this race will be whoever gets small local models to perform as well on consumer hardware. It'll also pop the tech bubble in the US.
$168.00 / 1M output tokens is hilarious for their "Pro". Can't wait to hear all the bitching from orgs next month. Literally the dumbest product of all time. Do you people seriously pay for this?
GPT-5.2 System Card PDF: https://cdn.openai.com/pdf/3a4153c8-c748-4b71-8e31-aecbde944...
I told all my friends to upgrade or they're not my friends anymore /s
No, thank you; OpenAI and ChatGPT don't cut it for me.
The thing about OpenAI is their models never fit anywhere for me. Yes, they may be smart, or even the smartest models, but they are always so fucking slow. The ChatGPT web app is literally unusable for me. I ask a simple task and it does the most extreme shit just to get an answer that's the same as Claude or Gemini.
For example, I asked ChatGPT to take a chart and convert it into a table. It went and cut up the image and zoomed in for literally 5 minutes to get a worse answer than Claude, which did it in under a minute.
I see people talk about Codex like it's better than Claude Code, and I go and try it and it takes a lifetime to do things and returns a result maybe on par with Opus or Sonnet, but it takes 5 minutes longer.
I just tried out this model and it's the exact same thing. It just takes ages to give you an answer.
I don't get how these models are useful in the real world.
What am I missing, is this just me?
I guess it's truly an enterprise model.
It baffles me to see these last 2 announcements (GPT 5.1 as well) devoid of any metrics, benchmarks or quantitative analyses. Could it be because they are behind Google/Anthropic and they don't want to admit it?
(edit: I'm sorry I didn't read enough on the topic, my apologies)
I feel like if we're going to regulate anything about AI, we should start by regulating (1) what they get to claim to be a "new model" to the public and (2) what changes they are allowed to make at inference before being forced to name it something different.
This shift toward new platforms is exactly why I'm building Truwol, a social experience focused on real, unedited human moments instead of the AI-saturated feeds we're drifting toward. I'm developing it independently and sharing the progress publicly, so if you're interested in projects reinventing online spaces from the ground up, you can see what I'm working on at buymeacoffee/Truwol
gpt-5.2 and gpt-5.2-chat-latest are the same token price? Isn't the latter non-thinking and more akin to -nano or -mini?