Hacker News

GPT Image 1.5

505 points by charlierguo yesterday at 6:07 PM | 241 comments

https://platform.openai.com/docs/models/gpt-image-1.5


Comments

vunderba yesterday at 9:40 PM

Okay results are in for GenAI Showdown with the new gpt-image 1.5 model for the editing portions of the site!

https://genai-showdown.specr.net/image-editing

Conclusions

- OpenAI has always had some of the strongest prompt understanding alongside the weakest image fidelity. This update goes some way towards addressing this weakness.

- It's leagues better than gpt-image-1 at making localized edits without altering the entire image's aesthetic, doubling the previous score from 4/12 to 8/12, and it's the only model that legitimately passed the Giraffe prompt.

- It's one of the most steerable models, with a 90% compliance rate.

Updates to GenAI Showdown

- Added outtakes sections to each model's detailed report in the Text-to-Image category, showcasing notable failures and unexpected behaviors.

- New models have been added including REVE and Flux.2 Dev (a new locally hostable model).

- Finally got around to implementing a weighted scoring mechanism which considers pass/fail, quality, and compliance for a more holistic model evaluation (click pass/fail icon to toggle between scoring methods).
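A weighted score of this kind might look roughly like the sketch below. The weights, the 0 to 1 scales, and all names (`PromptResult`, `WEIGHTS`, `weighted_score`, `model_score`) are assumptions for illustration, not the site's actual formula.

```python
from dataclasses import dataclass


@dataclass
class PromptResult:
    passed: bool       # did the output satisfy the prompt at all?
    quality: float     # 0.0 to 1.0, image quality (hypothetical scale)
    compliance: float  # 0.0 to 1.0, how closely instructions were followed (hypothetical scale)


# Hypothetical weights; the real site may combine these factors differently.
WEIGHTS = {"passed": 0.5, "quality": 0.25, "compliance": 0.25}


def weighted_score(result: PromptResult) -> float:
    """Combine pass/fail, quality, and compliance into a single 0 to 1 score."""
    return (
        WEIGHTS["passed"] * (1.0 if result.passed else 0.0)
        + WEIGHTS["quality"] * result.quality
        + WEIGHTS["compliance"] * result.compliance
    )


def model_score(results: list[PromptResult]) -> float:
    """Average the per-prompt weighted scores for one model."""
    if not results:
        return 0.0
    return sum(weighted_score(r) for r in results) / len(results)
```

Under these assumed weights, a prompt that passes with middling quality and compliance (0.5 each) scores 0.75, so a pass still dominates but no longer hides a mediocre result.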

If you just want to compare gpt-image-1, gpt-image-1.5, and NB Pro at the same time:

https://genai-showdown.specr.net/image-editing?models=o4,nbp...

minimaxir yesterday at 6:19 PM

I have a Nano Banana Pro blog post in the works expanding on my experiments with Nano Banana (https://news.ycombinator.com/item?id=45917875). Running a few of my test cases from that post and the upcoming blog post through this new ChatGPT Image model, this new model is better than Nano Banana but MUCH worse than Nano Banana Pro which now nails the test cases that previously showed issues. The pricing is unclear but gpt-image-1.5 appears to be 20% cheaper than the current gpt-image-1 model, which would put a `high`-quality generation in the same price range as Nano Banana Pro.

One curious case demoed here in the docs is the grid use case. Nano Banana Pro can also generate grids, but for NBP grid adherence to the prompt collapses after going higher than 4x4 (there's only a finite amount of output tokens to correspond to each subimage), so I'm curious that OpenAI started with a 6x6 case albeit the test prompt is not that nuanced.

oxag3n yesterday at 8:35 PM

If this was a farm of sweatshop Photoshoppers in 2010 who downloaded all the images from the internet and provided a service of combining them on request, this would escalate pretty quickly.

Question: with copyright and authorship dead wrt AI, how do I make (at least) new content protected?

Anecdotal: I had a hobby of taking photos in quite a rare style and lived in a place you'd find quite a few pictures of. When I asked GPT to generate a picture of that area in that style, it returned a highly modified but recognizable copy of a photo I published years ago.

agentifysh yesterday at 9:14 PM

I am very impressed. A benchmark I like to run is to have it create sprite maps and UV texture maps for an imagined 3D model.

Noticed it captured a megaman legends vibe ....

https://x.com/AgentifySH/status/2001037332770615302

and here it generated a texture map from a 3d character

https://x.com/AgentifySH/status/2001038516067672390/photo/1

However, I'm not sure if these are true, accurate UV maps, as I don't have the 3D model itself.

But I tried this in Nano Banana when it first came out and it couldn't do it.

blurbleblurble yesterday at 8:30 PM

It's really weird to see "make images from memories that aren't real" as a product pitch

yuni_aigc today at 6:35 PM

One thing I’ve noticed when comparing these models is that “quality” and “realism” don’t always move together.

Some models are very strong at sharp details and localized edits, but they can break global lighting consistency — shadows, reflections, or overall scene illumination drift in subtle ways. GPT-Image seems to trade a bit of micro-detail for better global coherence, especially in lighting, which makes composites feel more believable even if they’re not pixel-perfect.

It’s hard to capture this in benchmarks, but for real-world editing workflows it ends up mattering more than I initially expected.

encroach today at 2:33 AM

This outperforms Gemini 3 pro image (nano banana pro) on Text-to-Image Arena and Image Edit Arena. I'm surprised they didn't mention this leaderboard in the blog post.

I like this benchmark because it's based on user votes, so overfitting is not as easy (after all, if users prefer your result, you've won).

https://lmarena.ai/leaderboard/text-to-image

https://lmarena.ai/leaderboard/image-edit

sharkjacobs yesterday at 7:37 PM

Was it ever explained or understood why ChatGPT Images always has (had?) that yellow cast?

mingabunga yesterday at 9:49 PM

Did an experiment to give a software product a dark theme. Gave both (GPT and Gemini/Nano) a screenshot of the product and an example theme I found on Dribbble.

- Gemini/Nano did a pretty average job, only applying some grey to some of the panels. I tried a few different examples and got similar output.

- GPT did a great job and themed the whole app and made it look great. I think I'd still need a designer to finesse some things though.

abbycurtis33 yesterday at 6:16 PM

I still use Midjourney, because all of these major players are so bad at stylistic and creative work. They're singularly focused on photorealism.

rw2 today at 8:18 AM

Having used it compared to Nano Banana:

- The latency is still too high: under 10 seconds for Nano Banana versus around 25 seconds for GPT Image 1.5.

- The quality is higher, but it's not a jump like the one from previous Google models to Nano Banana Pro. Nano Banana Pro is still at least as good, or better, in my opinion.

password-app today at 12:27 AM

Impressive image quality improvements. Meanwhile, AI agents just crossed a milestone: Simular's Agent S hit 72.6% on OSWorld (human-level is 72.36%).

We're seeing AI get better at both creative tasks (images) and operational tasks (clicking through websites).

For anyone building AI agents: the security model is still the hard part. Prompt injection remains unsolved even with dedicated security LLMs.

chakintosh today at 8:32 AM

Can't wait to generate fake memories with my grandma who died 20 years ago

anonfunction yesterday at 9:11 PM

So the announcement said the API works with the new model, so I updated my Golang SDK grail (https://github.com/montanaflynn/grail) to use it, but it returns a 500 server error when you try. And if you switch to a completely unknown model name, the new model isn't listed among the supported values:

  POST "https://api.openai.com/v1/responses": 500 Internal Server Error {
    "message": "An error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com if the error persists. Please include the request ID req_******************* in your message.",
    "type": "server_error",
    "param": null,
    "code": "server_error"
  }

  POST "https://api.openai.com/v1/responses": 400 Bad Request {
    "message": "Invalid value: 'blah'. Supported values are: 'gpt-image-1' and 'gpt-image-1-mini'.",
    "type": "invalid_request_error",
    "param": "tools[0].model",
    "code": "invalid_value"
  }

fock today at 1:41 PM

Good to see that hands are still not solved...

aziis98 yesterday at 9:10 PM

I know this is a bit out of scope for these image editing models but I always try this experiment [1] of drawing a "random" triangle and then doing some geometric construction and they mess up in very funny ways. These models can't "see" very well. I think [2] is still very relevant.

[1]: https://chatgpt.com/share/6941c96c-c160-8005-bea6-c809e58591...

[2]: https://vlmsareblind.github.io/

alasano yesterday at 8:14 PM

It's still not available in the API despite them announcing the availability.

They even linked to their Image Playground where it's also not available..

I updated my local playground to support it and I'm just handling the 404 on the model gracefully

https://github.com/alasano/gpt-image-1-playground
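A minimal sketch of what handling a missing model gracefully could look like. The `generate` callable and the `ModelNotFound` exception are hypothetical stand-ins, not taken from the linked playground or from the OpenAI SDK; a real client would raise its own not-found error for a 404.

```python
class ModelNotFound(Exception):
    """Hypothetical error a client raises when the API answers 404 for a model."""


def generate_with_fallback(generate, prompt, models):
    """Try each model in order, skipping any the API does not know yet.

    Returns a (model_used, result) tuple, or raises ModelNotFound if
    every candidate model is unavailable.
    """
    unavailable = []
    for model in models:
        try:
            return model, generate(model=model, prompt=prompt)
        except ModelNotFound:
            unavailable.append(model)  # remember it and move on to the next one
    raise ModelNotFound(f"none of these models are available: {unavailable}")


def fake_generate(model, prompt):
    """Stand-in for a real API call: pretends gpt-image-1.5 is not rolled out yet."""
    if model == "gpt-image-1.5":
        raise ModelNotFound(model)
    return f"<image from {model} for {prompt!r}>"


used, image = generate_with_fallback(
    fake_generate, "a red bicycle", ["gpt-image-1.5", "gpt-image-1"]
)
print(used)
```

Listing the newest model first means the same code starts using it as soon as the API accepts it, with no further change needed.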

xnx yesterday at 7:50 PM

Great to have continued competition in the different model types.

What angle is there for second tier models? Could the future for OpenAI be providing a cheaper option when you don't need the best? It seems like that segment would also be dominated by the leading models.

I would imagine the future shakes out as: first class hosted models, hosted uncensored models, local models.

sroussey today at 7:10 AM

“Photo of a blond male in his 50s with half gray hair”

Still fails. Every photo of a man with half gray hair will have the other half black.

smlavine yesterday at 10:20 PM

This is terrifying. Truth is dead.

zkmon yesterday at 8:56 PM

AI-generated images would remove all the trust in and admiration for human talent in art, similar to how text generation would remove trust in and admiration for human talent in writing. Same case for coding.

So, let's simulate that future. Since no one trusts your talent in coding, art or writing, you wouldn't care to do any of these. But the economy is built on products and services which get their value based on how much human talent and effort is required to produce them.

So, the value of these services and products goes down as demand and trust go down. No one knows or cares who is a good programmer on the team, who is a great thinker and writer, and who is a modern Picasso.

So, the motivation disappears for humans. There are no achievements to target, there is no way to impress others with your talent. This should lead to a uniform workforce without much difference in talents. Pretty much a robot army.

KaiserPro yesterday at 8:47 PM

Is there watermarking, or some other way for normal people to tell if it's fake?

neom yesterday at 7:11 PM

Anyone else have issues verifying with openai? I always get a "congrats you're done" screen with a green checkmark from Persona, nothing to click, and my account stays unverified. (Edit, mystically, it's fixed..!)

sfmike yesterday at 8:38 PM

Hope to see more "red alert" status from the AI wars putting companies into all hands on deck. This is only helping the cost of tokens and efficacy. As always, competition only helps the end users.

gs17 yesterday at 9:14 PM

> Still some scientific inaccuracies, but ~70% correct

That's still dangerously bad for the use-case they're proposing. We don't need better looking but completely wrong infographics.

Garlef today at 9:38 AM

GPT images is the new MS Word "Arial + clip art"

surrTurr yesterday at 8:36 PM

not super impressed. feels like 70% as good as nano banana pro.

andai today at 7:03 AM

Sam Altman Christmas decoration isn't real, he can't hurt me...

sipsi today at 12:10 PM

the combination of two images the last gpt-image (nano banana) generated seems to be inappropriate

ezero yesterday at 7:17 PM

Even from their own curated examples, this looks quite a bit worse than Nano Banana in terms of preserving consistency on image edits.

ge96 yesterday at 10:29 PM

I get the tech implementation is amazing, I wonder if it takes away from genuineness of events, like the Astronaut photo, I get it's just a joke/funny too but it's like a photo of you in a supercar vs. actually buying one. Or fake AI companions vs. real people. Beauty filters/skinny filters vs. actually being healthy.

celeryd yesterday at 9:28 PM

If it can't generate non-sexual content of a woman in a bikini, I am not interested.

eterm yesterday at 11:42 PM

I have a "go to" prompt for images:

> In the style of a 1970s book sci-fi novel cover: A spacer walks towards the frame. In the background his spaceship crashed on an icy remote planet. The sky behind is dark and full of stars.

Nano banana pro via gemini did really well, although still way too detailed, and it then made a mess of different decades when I asked it to follow up: https://gemini.google.com/share/1902c11fd755

It's therefore really disappointing that GPT-image 1.5 did this:

https://chatgpt.com/share/6941ed28-ed80-8000-b817-b174daa922...

Completely generic, not at all like a book cover, it completely ignored that part of the prompt while it focused on the other elements.

Did it get the other details right? Sure, maybe even better, but the important part it just ignored completely.

And it's doing even worse when I try to get it to correct the mistake. It's just repeating the same thing with more "weathering".

raw_anon_1111 today at 1:36 AM

I still can’t get it to draw a “13 hour clock” correctly

v9v today at 7:11 AM

Lots of em-dashes in this copy.

GaryBluto today at 3:07 AM

God OpenAI are so far behind. Their own example shows that trying to only change specific parts of the image doesn't work without affecting the background.

dzonga yesterday at 8:40 PM

we seriously can't be burning GW of energy just to have sama in a GPT-Shirt Ad generated by A.I

impressive stuff though - as you can give it a base image + prompt.

mohsen1 yesterday at 8:02 PM

Unlike Nano Banana it allows generating photos of children. Always fun to ask AI to imagine children of a couple but it's also kinda concerning that there might be terrible use cases.

pdevr yesterday at 7:54 PM

>Now remove the two men, just keep the dog, and put them in an OpenAI livestream that looks like the attached image.

Where is the image given along with the prompt? If I didn't miss it: Would have been nice to show the attached image.

0dayman yesterday at 8:16 PM

nah Nano Banana Pro is much better

nightshift1 today at 12:11 AM

What is the endgame? Why is OpenAI throwing that much money on image/video generation? Is there a profitable market for AI-generated image slop? Do people choose ChatGPT instead of Gemini/Grok/Claude because of the image generation capabilities? To me, it looks like a huge fiery money pit.

StarterPro yesterday at 8:23 PM

In the image they showed for the new one, the mechanic was checking a dipstick...that was still in the vehicle.

I really hope everyone is starting to get disillusioned with OpenAI. They're just charging you more and more for what? Shitty images that are easy to sniff out?

In that case, I have a startup for you to invest in. It's a bridge-selling app.

enigma101 yesterday at 10:41 PM

Really can't stand the image slop suffocating the internet.

randall today at 1:01 AM

double popped collar ftw

catigula yesterday at 7:57 PM

Nano Banana Pro is so good that any other attempt feels 1-2 generations behind.

ares623 yesterday at 9:38 PM

My copium is that analog photography makes a comeback as a way to recover some level of trust and authenticity.

thumbsup-_- today at 4:16 AM

now you can create good memories with your family without meeting them
