Hacker News

himata4113 · today at 7:46 AM · 2 replies

What people don't realize is that cache is *free*. Well, not free, but compared to the compute required to recompute it? Relatively free.

If you remove the cached-token cost from the pricing, the overall API-equivalent usage drops from around $5,000 to $800 per month (about $200 per week) on the $200 Max subscription. That's still 4x cheaper than going through the API, but probably not losing them money either: if I had to guess, it's roughly break-even, since the compute would most likely go idle otherwise.
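The arithmetic above can be sketched as follows. This is a minimal illustration using the commenter's round numbers, which are estimates rather than measured billing data; the implied cache-read charge is just the difference between the two figures, not a published price.

```python
# Rough monthly cost comparison (USD), from the commenter's estimates.
api_cost_total = 5000     # API-equivalent cost, cached-token charges included
api_cost_no_cache = 800   # same usage with the cached-token cost removed
subscription_cost = 200   # Max subscription price

# The bulk of the billed cost is cache reads, not fresh compute:
implied_cache_charge = api_cost_total - api_cost_no_cache
print(f"implied cache-read charges: ${implied_cache_charge}/mo")

# Even ignoring cache charges entirely, the subscription still wins:
ratio = api_cost_no_cache / subscription_cost
print(f"subscription is {ratio:.0f}x cheaper than paying API rates")

print(f"roughly ${api_cost_no_cache / 4:.0f} per week")
```

The point of the comparison is that the $4,200 gap is billed for tokens the provider serves from cache, i.e. at a small fraction of the compute cost of recomputing them.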


Replies

crieme · today at 9:08 AM

> What people don't realize is that cache is free

I'm incredibly salty about this: they're aggressively monetizing something that actually lets them sell their inference to more users at premium prices. Without any caching, they'd have far less capacity available.

eru · today at 7:49 AM

> [...] if I had to guess it's break even as the compute is most likely going idle otherwise.

Why would it go idle? It would go to its next-best use: at the very least, it could help with model training or let their researchers run experiments.
