Hacker News

ra · today at 5:32 AM

Nice. I've been thinking of doing something similar in our local jurisdiction (Australia).

Are you able to share (or point me toward) any high-level details: (key hardware, hosting stack, high-level economics, key challenges)?

I'd love to offer to buy you a coffee but I won't be in Switzerland any time soon.


Replies

sacrelege · today at 10:45 AM

Ah thanks, I love coffee

At a high level, it's a mix of our own GPU capacity plus the ability to burst into external nodes when things get busy. Right now we're running a bunch of RTX PRO 6000s, which basically forces you into workstation/server boards, since you need a full x16 PCIe 5.0 slot per card.

We operate a small private datacenter, which gives us some flexibility in how we deploy and scale hardware. On the software side, we're currently using LiteLLM as a load balancer in front of the inference servers, though I'm in the process of replacing that with a custom Rust-based implementation.
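For anyone curious what the core of such a balancer looks like, here's a minimal sketch of the request-routing piece in Rust. This is purely illustrative (the backend URLs and the round-robin strategy are my assumptions, not necessarily what we ship): an atomic counter spreads incoming requests across a fixed list of inference backends.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical round-robin balancer core: each call to `pick`
/// returns the next backend in rotation.
struct Balancer {
    backends: Vec<String>,
    next: AtomicUsize,
}

impl Balancer {
    fn new(backends: Vec<String>) -> Self {
        Balancer { backends, next: AtomicUsize::new(0) }
    }

    /// Atomically advance the counter so that concurrent requests
    /// are spread evenly across backends without a lock.
    fn pick(&self) -> &str {
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.backends.len();
        &self.backends[i]
    }
}

fn main() {
    // Backend addresses below are made up for illustration.
    let lb = Balancer::new(vec![
        "http://gpu-node-1:8000".to_string(),
        "http://gpu-node-2:8000".to_string(),
    ]);
    for _ in 0..4 {
        println!("{}", lb.pick());
    }
}
```

A real implementation would layer health checks, per-model routing, and queue-depth-aware selection on top of this, which is where a purpose-built Rust service can beat a generic proxy.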

We've only been online since the beginning of this month, so I can't really say much about the economics yet, but we've had some really nice feedback from early customers so far. :)