jasonjmcghee • yesterday at 8:07 PM • 3 replies

That is... astronomically different. Is GPT-5.1 downscaling and losing critical information or something? How could it be so different?

Replies

energy123 • yesterday at 11:28 PM

This is my default explanation for visual impairments in LLMs: they're trying to compress the image into about 3000 tokens, so you're going to lose a lot in the name of efficiency.

zubiaur • yesterday at 11:17 PM

It has a rather poor max resolution. Higher-resolution images get tiled, but only up to a point. I think 512 x 512 is the max tile size and 2048 x 2048 the max canvas.
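
To make those numbers concrete, here is a rough sketch of the arithmetic, assuming the limits quoted above (512 x 512 tiles, 2048 x 2048 canvas) are right. They are the commenter's recollection, not confirmed model behavior, and the function names are made up for illustration.

```python
import math

# Both limits are the figures recalled in the comment above, not confirmed values.
ASSUMED_MAX_CANVAS = 2048
ASSUMED_TILE = 512


def fit_to_canvas(width, height, max_canvas=ASSUMED_MAX_CANVAS):
    """Shrink (never enlarge) so the longer side fits within max_canvas."""
    scale = min(1.0, max_canvas / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))


def tile_count(width, height, tile=ASSUMED_TILE):
    """Number of tile x tile squares needed to cover the image."""
    return math.ceil(width / tile) * math.ceil(height / tile)


def downscale_factor(width, height, max_canvas=ASSUMED_MAX_CANVAS):
    """How many original pixels collapse into one pixel along each axis."""
    fitted_width, _ = fit_to_canvas(width, height, max_canvas)
    return width / fitted_width
```

Under these assumptions a 4096 x 2048 screenshot would be halved along each axis before tiling, so a UI element 16 pixels wide would reach the model at roughly 8 pixels.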

ericd • yesterday at 9:07 PM

I got much better results with smallish UI elements in large screenshots on GPT by slicing the screenshot up manually and feeding the pieces in one at a time. I think it does severely lossy downscaling.
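
The manual slicing described above could be sketched roughly like this. The 512-pixel tile size and 64-pixel overlap are assumptions chosen for illustration, not values from the comment, and `slice_boxes` is a hypothetical helper. The overlap is there so that a small element sitting on a tile boundary still appears whole in at least one slice.

```python
def slice_boxes(width, height, tile=512, overlap=64):
    """Return (left, upper, right, lower) crop boxes that cover the image.

    tile and overlap are illustrative defaults, not known model limits.
    Neighbouring boxes share `overlap` pixels; edge boxes may be smaller.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    step = tile - overlap
    boxes = []
    top = 0
    while True:
        lower = min(top + tile, height)
        left = 0
        while True:
            right = min(left + tile, width)
            boxes.append((left, top, right, lower))
            if right >= width:
                break
            left += step
        if lower >= height:
            break
        top += step
    return boxes
```

Each box uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so the slices can be cut with `image.crop(box)` and sent to the model one at a time.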
