
tedsanders yesterday at 6:44 PM

Yep, the point we wanted to make here is that GPT-5.2's vision is better, not perfect. Cherrypicking a perfect output would actually mislead readers, and that wasn't our intent.


Replies

BoppreH yesterday at 7:28 PM

That would be a laudable goal, but I feel like it's contradicted by the text:

> Even on a low-quality image, GPT‑5.2 identifies the main regions and places boxes that roughly match the true locations of each component

I would not consider it to have "identified the main regions" or to have "roughly matched the true locations" when ~1/3 of the boxes have incorrect labels. The remark "even on a low-quality image" is not helping either.

Edit: credit where credit is due, the recently-added disclaimer is nice:

> Both models make clear mistakes, but GPT‑5.2 shows better comprehension of the image.

arscan yesterday at 7:34 PM

I think you may have inadvertently misled readers in a different way. I feel misled after not catching the errors myself, assuming it was broadly correct, and then coming across this observation here. Might be worth mentioning this is better but still inaccurate. Just a bit of feedback, I appreciate you are willing to show non-cherry-picked examples and are engaging with this question here.

Edit: As mentioned by @tedsanders below, the post was edited to include clarifying language such as: “Both models make clear mistakes, but GPT‑5.2 shows better comprehension of the image.”

layer8 yesterday at 7:50 PM

You know what would be great? If it had added some boxes with “might be X or Y, but not sure”.

g947o yesterday at 7:41 PM

When I saw that it labeled DP ports as HDMI, I immediately decided that I am not going to touch this until it is at least 5x better, with 95% accuracy on basic things.

I don't see any advantage in using the tool.

iwontberude yesterday at 8:00 PM

But it’s completely wrong.

iamdanieljohns yesterday at 7:42 PM

Is Adaptive Reasoning gone from GPT-5.2? It was a big part of the release of 5.1 and Codex-Max. Really felt like the future.

d--b yesterday at 7:16 PM

[flagged]

johnwheeler yesterday at 11:00 PM

Oh and you guys don't mislead people ever. Your management is just completely trustworthy, and I'm sure all you guys are too. Give me a break, man. If I were you, I would jump ship or you're going to be like a Theranos employee on LinkedIn.
