KeplerBoy, today at 8:42 AM

It's not that expensive unless you run millions of tokens through an agent. For use cases where you actually read all the input and output yourself (i.e., an actual conversation), it is insanely cheap.


Replies

tucnak, today at 10:59 AM

Yeah, at my last job, unsupervised dataset-scale transformations amounted to 97% of all spending. We were using Gemini 2.5 Flash in batch/prefill-caching mode on Vertex, and always the latest and brightest models for ChatGPT-like conversations.