
ls612 • today at 2:59 PM

The estimates I've seen are that running inference at scale on a DeepSeek V3 sized model (so roughly 700B parameters) costs about $0.70/mtok given current H100 rental costs. Sonnet charges $15/mtok on the API, so the gap between the true cost and the API price is quite large, to the point where even many subscription users are likely profitable.
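
For anyone who wants to poke at the arithmetic, here is a rough sketch using only the two figures quoted above. Both are estimates, and it assumes the serving cost of a DeepSeek V3 sized model is a fair proxy for Sonnet's, which is not known. The $20 monthly fee in the last line is a hypothetical number picked for illustration, not a quoted subscription price.

```python
# Figures quoted in the comment; rough estimates, not measured data.
SERVING_COST_PER_MTOK = 0.70   # estimated cost to serve 1M tokens, USD
API_PRICE_PER_MTOK = 15.00     # quoted API price per 1M tokens, USD


def markup(price: float, cost: float) -> float:
    """How many times larger the price is than the cost."""
    return price / cost


def gross_margin(price: float, cost: float) -> float:
    """Fraction of the price left over after paying the serving cost."""
    return (price - cost) / price


def breakeven_mtok(monthly_fee: float, cost_per_mtok: float) -> float:
    """Millions of tokens a flat-fee user can consume before the
    serving cost exceeds what they pay."""
    return monthly_fee / cost_per_mtok


if __name__ == "__main__":
    m = markup(API_PRICE_PER_MTOK, SERVING_COST_PER_MTOK)
    g = gross_margin(API_PRICE_PER_MTOK, SERVING_COST_PER_MTOK)
    # HYPOTHETICAL_FEE is an assumption for illustration only.
    HYPOTHETICAL_FEE = 20.00
    b = breakeven_mtok(HYPOTHETICAL_FEE, SERVING_COST_PER_MTOK)
    print(f"markup: {m:.1f}x, gross margin: {g:.1%}")
    print(f"break-even at ${HYPOTHETICAL_FEE:.0f}/month: {b:.1f} mtok")
```

On those inputs the API price works out to a bit over 21x the estimated serving cost, so a flat-fee user stays profitable until they burn through tens of millions of tokens a month.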
