storystarling • today at 8:41 AM

The unit economics seem tough at that price for a 1T-parameter model. Even with MoE sparsity, you are still VRAM-bound just keeping the weights resident: sparsity cuts the compute per token, not the memory footprint. That is a much higher baseline cost than serving a smaller model like Haiku.
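To put rough numbers on the "weights resident" point, here is a back-of-envelope sketch. Everything in it is an assumption for illustration: the 80 GB per-accelerator figure, the bytes-per-parameter values, and the 20B "small model" size (which is a made-up stand-in, not Haiku's actual size, which is not public). It also ignores KV cache, activations, and replication, so real deployments need more than this floor.

```python
import math


def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in GB (1 GB = 1e9 bytes).

    Takes TOTAL parameters on purpose: with MoE, every expert has to stay
    resident even though only a few are active for any given token.
    """
    return total_params * bytes_per_param / 1e9


def min_gpus_for_weights(total_params: float, bytes_per_param: float,
                         gpu_memory_gb: float) -> int:
    """Lower bound on accelerators needed to keep the weights in VRAM."""
    return math.ceil(weight_memory_gb(total_params, bytes_per_param) / gpu_memory_gb)


GPU_MEMORY_GB = 80    # assumed per-accelerator VRAM
BIG_MODEL = 1e12      # 1T total parameters, from the comment
SMALL_MODEL = 20e9    # hypothetical small dense model, NOT a real Haiku figure

for label, params in [("1T MoE", BIG_MODEL), ("small dense", SMALL_MODEL)]:
    for bytes_per_param in (2, 1):  # 16-bit and 8-bit weights
        gb = weight_memory_gb(params, bytes_per_param)
        gpus = min_gpus_for_weights(params, bytes_per_param, GPU_MEMORY_GB)
        print(f"{label:12s} {bytes_per_param} B/param: {gb:7.0f} GB, >= {gpus} GPUs")
```

Under these assumptions the 1T model needs on the order of a dozen or more accelerators before it serves a single token, while the small model fits on one. The number of active experts never enters the calculation, which is the point being made.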
