
reactordev • today at 7:42 PM • 1 reply

“Pricing is per token, no idle costs: GPT-OSS-120B is $0.02 in / $0.095 out, Qwen3.5-122B is $0.20 in / $1.60 out. Full model list and pricing at https://ionrouter.io.”

Man, you had me panicking there for a second. Per token?!? Turns out it’s per million tokens, according to their site.
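To make the per-million reading concrete, here is a rough sketch of the arithmetic. It assumes the quoted figures are USD per one million tokens, as the site reportedly states; the `request_cost` helper and the token counts are hypothetical and only for illustration.

```python
# Prices quoted in the thread, read as USD per 1,000,000 tokens (assumption):
# (input price, output price)
PRICES_PER_MILLION = {
    "GPT-OSS-120B": (0.02, 0.095),
    "Qwen3.5-122B": (0.20, 1.60),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    price_in, price_out = PRICES_PER_MILLION[model]
    # Scale token counts by the per-million rate.
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Example: a 2,000-token prompt with a 500-token completion.
    for model in PRICES_PER_MILLION:
        print(model, f"${request_cost(model, 2_000, 500):.6f}")
```

Under that assumption, a 2,000-token prompt with a 500-token reply on Qwen3.5-122B comes to roughly $0.0012, whereas a literal per-token reading would put the same request at $1,200.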

Cool concept. I used to run a Fortune 500’s cloud, and hot-and-ready GPU instances were the biggest ask. We weren’t ready for that, cost-wise, so we only spun them up when absolutely necessary.

Replies

2uryaa • today at 9:19 PM

Haha sorry for the typo! Your F500 use case is exactly who we want to target, especially as they start serving finetunes on their own data. Thanks for the feedback!
