Hacker News

storystarling, last Tuesday at 7:39 PM

The unit economics seem pretty rough though. You're locking up 8xH100s for the compute of ~32B active parameters. I guess memory is the bottleneck but hard to see how the margins work on that.


Replies

kristianp, yesterday at 8:59 PM

Yes, it only makes sense economically if you have batching over many users.
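
A rough back-of-the-envelope sketch of why batching matters here. Every number is a hypothetical placeholder (node price, per-user decode speed), not a measured figure, and it assumes per-user speed stays flat as the batch grows, which is only roughly true while decoding is memory-bound:

```python
def cost_per_million_tokens(node_cost_per_hour, tokens_per_sec_per_user, concurrent_users):
    """Dollar cost per 1M generated tokens for one node shared by a batch of users."""
    tokens_per_hour = tokens_per_sec_per_user * concurrent_users * 3600
    return node_cost_per_hour / tokens_per_hour * 1_000_000


# Hypothetical inputs, chosen only for illustration.
NODE_COST_PER_HOUR = 16.0        # assumed price for one 8-GPU node, in dollars
TOKENS_PER_SEC_PER_USER = 50.0   # assumed decode speed seen by each user

for users in (1, 8, 64):
    cost = cost_per_million_tokens(NODE_COST_PER_HOUR, TOKENS_PER_SEC_PER_USER, users)
    print(f"{users:>3} concurrent users: ${cost:.2f} per 1M tokens")
```

Under these assumptions the cost per token falls in direct proportion to the number of users sharing the node, so a node serving one user at a time is the worst case for margins.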