Input: $0.95
Cache hit (most important): $0.19
Output: $4.00
This matches what Moonshot charges for it and puts it at roughly the price of GPT 5.4 mini. Not a bad option.
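For anyone who wants to plug in their own numbers, here is a rough sketch of how those three rates turn into a per-request cost. It assumes the prices above are USD per one million tokens and that reasoning tokens are billed as output; neither is stated in this thread. The function name `request_cost` and the token counts in the usage note are made up for illustration.

```python
# Assumption: the listed prices are USD per one million tokens.
PRICES_PER_MTOK = {"input": 0.95, "cache_hit": 0.19, "output": 4.00}


def request_cost(input_tokens, output_tokens, cached_tokens=0, prices=PRICES_PER_MTOK):
    """Return the USD cost of one request.

    cached_tokens is the part of input_tokens billed at the cache-hit rate.
    Assumption: reasoning tokens are counted inside output_tokens.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * prices["input"]
        + cached_tokens * prices["cache_hit"]
        + output_tokens * prices["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical token counts, not measured from any real request.
    print(f"${request_cost(40, 1500):.4f}")
```

With those made-up counts (40 input tokens, 1,500 output tokens) the total comes to about $0.006, almost all of it output. That is why a long reasoning trace dominates the bill for a short prompt.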
For some context, here is a stupid prompt that wastes tokens: "Play a game of tic tac toe against yourself on a 5x5 board, you need 5 in a row to win."
It costs $0.006 on Kimi K2.7, and you get to see the whole raw reasoning trace.
GPT-5.4 mini costs $0.016, and its reasoning is only summarized.
And in case you are wondering, both play incredibly stupidly.
Kimi:
A B C D E
1 . . . . .
2 . . . . .
3 X X X X X
4 . O O O O
5 . . . . .
GPT 5.4 mini:
1: X X X X X
2: O O . . .
3: . . O . .
4: . . . O .
5: . . . . O
Nice idea. I just asked Haiku to do the same in Claude Chat on iOS: it created an interactive React game, implemented the rules, and let it play. Clever move for $1 input and $5 output, Anthropic!
While LLMs are bad at games, they are perfectly capable of writing an RL agent to train on the game itself.
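To make that concrete, here is a minimal sketch of the game rules such an agent would train against: a 5x5 board where only a full row, column, or main diagonal wins. This is just the environment plus a random self-play loop, not an RL agent. The names `winner` and `random_self_play` are my own, not from any library.

```python
import random

SIZE = 5  # 5x5 board, 5 in a row to win, as in the prompt upthread


def winner(board):
    """Return 'X' or 'O' if a player fills a whole row, column, or main diagonal, else None.

    board is a sequence of SIZE rows, each a sequence of 'X', 'O', or '.'.
    """
    lines = [[(r, c) for c in range(SIZE)] for r in range(SIZE)]    # rows
    lines += [[(r, c) for r in range(SIZE)] for c in range(SIZE)]   # columns
    lines.append([(i, i) for i in range(SIZE)])                     # diagonal
    lines.append([(i, SIZE - 1 - i) for i in range(SIZE)])          # anti-diagonal
    for line in lines:
        r0, c0 = line[0]
        first = board[r0][c0]
        if first != "." and all(board[r][c] == first for r, c in line):
            return first
    return None


def random_self_play(rng):
    """Play uniformly random moves until someone wins or the board is full."""
    board = [["."] * SIZE for _ in range(SIZE)]
    cells = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    rng.shuffle(cells)
    for turn, (r, c) in enumerate(cells):
        board[r][c] = "XO"[turn % 2]  # X moves first, then players alternate
        result = winner(board)
        if result is not None:
            return result, board
    return None, board  # draw


if __name__ == "__main__":
    outcome, final_board = random_self_play(random.Random(0))
    print(outcome)
    print("\n".join(" ".join(row) for row in final_board))
```

Feeding it the Kimi board from upthread, `winner([".....", ".....", "XXXXX", ".OOOO", "....."])` returns `'X'`. So the game is at least legal; the problem is that O never blocked.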
When I'm extremely bored, I think I'll make two models play chess against each other. I bet there's already a chess benchmark / LLM tournament somewhere.
Btw if anyone is wondering, GPT 5.5 does the same garbage as 5.4 mini for 4 times the cost.
Fable manages to make a reasonable game, at a cost of 40 cents.