logoalt Hacker News

ls612today at 2:59 PM0 repliesview on HN

The estimates I've seen are that running inference at scale on a Deepseek V3 sized model (so 700B parameters) costs roughly $0.70/mtok or so given current H100 rental costs. Sonnet charges $15/mtok on the API so the delta between the true cost and the API cost is quite large, to the point where even many subscription users are likely profitable.