cbg0 • yesterday at 6:32 PM • 2 replies

If it uses half the tokens to complete a task, then doubling the cost is perfectly fine. But is that actually true?

Replies

2001zhaozhao • yesterday at 6:36 PM

This happens with every new model release, though. The model makes fewer mistakes and spends less time fixing them, so it uses fewer tokens on a task of the same difficulty. Almost any task other than straight boilerplate will benefit from this.

In the same vein, I would guess that Opus 4.7 is probably cheaper for most tasks than 4.6, even though its tokenizer produces more tokens for a string of the same length.
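
To make the arithmetic behind this concrete, here is a minimal sketch of the break-even reasoning. It assumes cost is simply tokens times a per-token price. The function names and every number in it are hypothetical illustrations, not measured figures for any real model.

```python
def task_cost(tokens: int, price_per_million: float) -> float:
    """Cost of one task, assuming cost = tokens * price per token."""
    return tokens / 1_000_000 * price_per_million


def relative_cost(price_ratio: float, tokenizer_ratio: float, work_ratio: float) -> float:
    """New cost divided by old cost for the same task.

    price_ratio:     new price per token / old price per token
    tokenizer_ratio: new tokens per string / old tokens per string
    work_ratio:      new amount of text processed / old amount
                     (below 1.0 if the model wastes less effort on mistakes)
    All three inputs are assumptions the caller has to supply.
    """
    return price_ratio * tokenizer_ratio * work_ratio


# Hypothetical numbers: half the tokens at double the price breaks even.
old = task_cost(tokens=100_000, price_per_million=10.0)
new = task_cost(tokens=50_000, price_per_million=20.0)
print(old, new)

# Hypothetical numbers: same price, a tokenizer that emits 30% more tokens,
# and a model that needs 30% less work. A result below 1.0 means cheaper.
print(relative_cost(price_ratio=1.0, tokenizer_ratio=1.3, work_ratio=0.7))
```

Under this model the new release is cheaper whenever the product of the three ratios falls below 1.0, so whether the claim holds depends on how large the reduction in work actually is.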


jstummbillig • yesterday at 7:20 PM

We'll find out!
