Hacker News

MuffinFlavored, yesterday at 8:20 PM

> Running DeepSeek V3 (685B) requires 8×H100 GPUs which is about $14k/month. Most developers only need 15-25 tok/s.

> deepseek-v3.2-685b, $40/mo/slot for ~20 tok/s, 465 slots total

> 465 users × 20 tok/s = 9,300 tok/s needed

> The node peaks at ~3,000 tok/s total. So at full capacity they can really only serve:

> 3,000 ÷ 20 = 150 concurrent users at 20 tok/s

> That's only 32% of the cohort being active simultaneously.
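For anyone who wants to check the arithmetic in the quote, here is a minimal sketch. It uses only the figures quoted above (465 slots, ~20 tok/s per slot, ~3,000 tok/s node peak); the variable names are made up for illustration, and none of these numbers are measured.

```python
# Figures taken from the quoted text above; nothing here is measured.
SLOTS = 465                       # slots sold on the node
TOKENS_PER_SEC_PER_SLOT = 20      # advertised per-slot speed
NODE_PEAK_TOKENS_PER_SEC = 3_000  # quoted peak throughput of the node

# Demand if every slot generated at full speed at the same moment.
demand_if_all_active = SLOTS * TOKENS_PER_SEC_PER_SLOT

# How many users the node can serve at the advertised speed.
max_concurrent = NODE_PEAK_TOKENS_PER_SEC // TOKENS_PER_SEC_PER_SLOT

# Share of the cohort that can be active at once, and the oversubscription ratio.
active_fraction = max_concurrent / SLOTS
oversubscription = demand_if_all_active / NODE_PEAK_TOKENS_PER_SEC

print(f"demand if all active: {demand_if_all_active} tok/s")
print(f"max concurrent users: {max_concurrent}")
print(f"active fraction:      {active_fraction:.1%}")
print(f"oversubscription:     {oversubscription:.1f}x")
```

So the node is sold at roughly 3.1x its quoted peak, which only works if no more than about a third of the slots are generating at any one time.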


Replies

artificialprint, yesterday at 8:22 PM

People presumably work 8 hours a day, so I guess they are banking on that.
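A rough way to put a number on the 8-hours-a-day idea is sketched below. The assumptions are mine, not the provider's: working hours are spread evenly around the clock (users in many time zones), and `generating_fraction` is a hypothetical knob for how much of a working hour a user is actually generating tokens.

```python
def expected_concurrent(slots: int, active_hours_per_day: float,
                        generating_fraction: float = 1.0) -> float:
    """Average number of slots generating at once.

    Assumes usage is spread uniformly over 24 hours, which is an
    illustrative simplification and ignores time-zone clustering.
    """
    return slots * active_hours_per_day / 24 * generating_fraction


CAPACITY = 3_000 / 20  # concurrent users the node serves at 20 tok/s (quoted figures)

# Every user generating nonstop for their whole 8-hour day.
nonstop = expected_concurrent(465, 8)

# Hypothetical: users generate tokens only half of their working time.
half_time = expected_concurrent(465, 8, generating_fraction=0.5)

print(f"capacity: {CAPACITY:.0f}, nonstop: {nonstop:.1f}, half-time: {half_time:.1f}")
```

Under the nonstop assumption the average load sits right around the node's capacity, so the plan depends on users idling for a good part of their working hours, or on demand not clustering in one time zone.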