
jumploops • yesterday at 6:17 PM • 2 replies

> GPT‑5.5 improves on GPT‑5.4’s scores while using fewer tokens.

This might be great if it translates to agentic engineering and not just benchmarks.

It seems some of the gains from Opus 4.6 to 4.7 required more tokens, not fewer.

Maybe more interesting is that they’ve used Codex to improve model inference latency. IIRC this is a new (and, as expected, larger) pretrain, so it’s presumably slower to serve.

Replies

beering • yesterday at 6:30 PM

With Opus it’s hard to tell what was due to the tokenizer changes. Maybe using more tokens for the same prompt means the model effectively thinks more?

conradkay • yesterday at 6:29 PM

They say latency is the same as 5.4, and 5.5 is served on GB200 NVL72, so I assume 5.4 was served on Hopper.
