Hacker News

themgt, today at 1:12 AM

AFAIK the economics are somewhat painful. Not sure of the exact numbers, but back-of-napkin it's something like:

   • 150-500B: Sonnet
   • 0.9-2T: Opus
   • 3-5T/10T: Fable / Mythos
So if a bigger model is "smarter" but you effectively wind up with a "shared hosting" model, where a coherent inference node (or nodes) that costs $2m or something can run at most 10 customer workloads simultaneously ... not sure what that can be priced at.

If it turns out a $10m/10x shared node can host even smarter models, then what?
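
To put rough numbers on the pricing question, here is a minimal sketch of the break-even arithmetic. Only the node costs ($2m, $10m) and the 10-workload limit come from the comment above. The 3-year amortization window and the 100% utilization default are placeholder assumptions, and the function name is hypothetical. It covers hardware cost only, ignoring power, networking, staff, and margin.

```python
HOURS_PER_YEAR = 365 * 24


def breakeven_per_slot_hour(node_cost_usd, concurrent_workloads,
                            amortization_years=3.0, utilization=1.0):
    """Hardware-only break-even price for one customer slot, per hour.

    amortization_years and utilization are assumed values, not figures
    from the thread.
    """
    # Total billable slot-hours over the life of the node.
    slot_hours = (amortization_years * HOURS_PER_YEAR
                  * concurrent_workloads * utilization)
    return node_cost_usd / slot_hours


if __name__ == "__main__":
    for cost in (2_000_000, 10_000_000):
        rate = breakeven_per_slot_hour(cost, concurrent_workloads=10)
        print(f"${cost:,} node, 10 slots: ${rate:.2f} per slot-hour")
```

Under those assumptions the $2m node needs about $7.61 per slot-hour just to recover the hardware, and the $10m node about $38.05. Lower utilization raises both in proportion: at 50% utilization each figure doubles.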


Replies

versteegen, today at 2:36 AM

Fable/Mythos are based on the same model. Not totally clear whether they have identical weights (just different external guardrails), or whether there's also some slight fine-tuning difference.