Hacker News

kalkin · today at 5:04 PM · 4 replies

AFAICT this uses a token-counting API to count the same prompt's tokens under both models, so it's measuring the tokenizer change in isolation. But smarter models also sometimes produce shorter outputs, and therefore fewer output tokens. That doesn't mean Opus 4.7 necessarily nets out cheaper (it might still be more expensive), but input-only comparison isn't really very useful.
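The measurement being described can be sketched as: feed the identical prompt to a token-counting endpoint once per model and compare the counts. A minimal sketch, with the endpoint stubbed out and illustrative model names (the real call would be something like Anthropic's `messages.count_tokens`; the names and counts below are made up):

```python
def tokenizer_delta(count_fn, models, prompt):
    """Relative change in input-token count for the SAME prompt under two
    models' tokenizers. count_fn(model, prompt) -> int is expected to wrap
    a token-counting API call; it is stubbed below."""
    old, new = (count_fn(m, prompt) for m in models)
    return (new - old) / old

# Stub standing in for a real token-counting API (hypothetical numbers):
fake_counts = {"opus-4.6": 1000, "opus-4.7": 1300}
delta = tokenizer_delta(
    lambda model, _prompt: fake_counts[model],
    ("opus-4.6", "opus-4.7"),
    "the exact same prompt text",
)
print(f"{delta:+.0%} input tokens for the same prompt")  # +30%
```

Note this isolates the tokenizer: it says nothing about output length or total cost, which is the commenter's point.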


Replies

h14h · today at 5:16 PM

For some real data, Artificial Analysis reported that 4.6 (max) and 4.7 (max) used 160M tokens and 100M tokens to complete their benchmark suite, respectively:

https://artificialanalysis.ai/?intelligence-efficiency=intel...

Looking at their cost breakdown, while input cost rose by $800, output cost dropped by $1400. Granted, whether the output savings offset the input increase will be very use-case dependent, and I imagine the delta is much closer at lower effort levels.
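The arithmetic from the figures quoted above works out to a net saving on that particular suite (dollar and token figures are from the comment; per-token prices are not shown there):

```python
# Net effect on the Artificial Analysis benchmark run, per the comment above.
input_increase = 800      # 4.7 spent ~$800 more on input
output_savings = 1400     # ...but ~$1400 less on output
net_savings = output_savings - input_increase

token_drop = 1 - 100 / 160  # 160M tokens (4.6) -> 100M tokens (4.7)

print(f"net: ${net_savings} cheaper on the full suite")
print(f"tokens to finish the suite: {token_drop:.1%} fewer")
```

So on this one workload the output reduction more than offset the pricier input, though as noted that balance will shift with the input/output mix of a given use case.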

SkyPuncher · today at 5:42 PM

Yes. I actually noticed my token usage go down on 4.6 when I started switching every session to max effort. I got work done faster, in fewer steps, because the thinking corrected itself before it started cycling.

I've noticed 4.7 cycling a lot more on basic tasks, though it also seems a bit better at holding long-running context.

manmal · today at 5:06 PM

Why is it not useful? Per-token input pricing is unchanged for 4.7, but the new tokenizer produces more tokens for the same text, so the same prompt now costs roughly 30% more for input.

the_gipsy · today at 6:00 PM

With AIs, it seems like there's never a comparison that is actually useful.
