
mlyle yesterday at 6:08 PM

The pay-per-use API sucks. If you end up on the $50/mo plan, it's better, with caveats:

1 million tokens per minute, 24 million tokens per day. BUT: cached tokens count in full toward those limits, so if you have 100,000 tokens of context you can burn a whole minute's allowance in about ten requests.
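
For a rough sense of the arithmetic, here is a minimal sketch. It assumes the limits quoted above and that every request is charged its full context, cached or not; the function name and the `output_tokens` parameter are illustrative, not part of any provider's API.

```python
# Limits as quoted in the comment above (assumed, not verified against any plan page).
TOKENS_PER_MINUTE = 1_000_000
TOKENS_PER_DAY = 24_000_000


def requests_until_limit(limit_tokens: int, context_tokens: int, output_tokens: int = 0) -> int:
    """Whole requests that fit under a token limit when each request is
    charged its full context (cached tokens included) plus its output."""
    per_request = context_tokens + output_tokens
    return limit_tokens // per_request


# With 100,000 tokens of context and output ignored:
print(requests_until_limit(TOKENS_PER_MINUTE, 100_000))  # 10 requests per minute
print(requests_until_limit(TOKENS_PER_DAY, 100_000))     # 240 requests per day
```

Note that the daily limit is the tighter one here: 240 requests per day is only 24 minutes of running at the per-minute cap.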


Replies

solarkraft today at 12:19 AM

It’s wild that cached tokens count full - what’s in it for you to care about caching at all then? Is the processing speed gain significant?
