kgeist • yesterday at 9:00 PM

We're planning to do the same thing: buy something like 8xH100 and run all coding there. The CTO has almost agreed to find the budget for it, but I need to make sure there are no risks before we buy (i.e. that it's a viable, usable setup for professional AI-assisted coding).

Can you share which models you run and find best performing on this setup? That would help a lot. I already run a smaller AI server in the office, but only 32B models fit there. I have experience optimizing inference; I'm just interested in which models you think are great on 8xH100 for coding. I'll figure out the details of how to fit them :)

Replies

dools • today at 2:02 AM

Check out Verda: you can rent whatever super powerful GPU clusters you need in 10-minute increments. Deploy any open-weight model using SGLang and away you go.

htrp • today at 1:43 AM

8x H100 80GB doesn't give you enough memory to run the latest 1T+ parameter models (especially at the context window lengths needed to be competitive with the frontier models).
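
For a rough sense of the arithmetic behind this, here is a back-of-the-envelope sketch. It counts weights only and assumes 1 GB = 10^9 bytes; KV cache, activations, and framework overhead are ignored, so real headroom is smaller. The helper name `weights_gb` and the precision choices are illustrative assumptions, not anything a specific model or inference engine prescribes.

```python
def weights_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    # params_billions * 1e9 params * (bits / 8) bytes, divided by 1e9 bytes per GB
    return params_billions * bits_per_param / 8


# Aggregate memory of 8 GPUs with 80 GB each
TOTAL_GB = 8 * 80

# A 1T-parameter model at a few common weight precisions (assumed values)
for bits in (16, 8, 4):
    need = weights_gb(1000, bits)
    headroom = TOTAL_GB - need
    print(f"{bits:>2}-bit weights: {need:6.0f} GB needed, headroom {headroom:+.0f} GB")
```

Under these assumptions only the 4-bit case fits in the 640 GB total, and what remains has to cover the KV cache for every concurrent long-context session.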

Havoc • yesterday at 11:53 PM

DeepSeek, GLM, MiniMax, or Kimi are the most likely contenders.
