Hacker News

kgeist · yesterday at 9:00 PM · 3 replies

We're planning to do the same thing: buy something like an 8×H100 box and run all coding workloads on it. The CTO has almost agreed to find the budget for it, but I need to make sure there are no risks before we buy (i.e., that it's a viable/usable setup for professional AI-assisted coding).

Can you share which models you run and find best-performing on this setup? That would help a lot. I already run a smaller AI server in the office, but only 32B models fit on it. I also have experience optimizing inference; I'm just interested in which models you think are great on 8×H100 for coding, and I'll figure out the details of how to fit them :)


Replies

dools · today at 2:02 AM

Check out Verda: you can rent whatever super-powerful GPU clusters you need in 10-minute increments. Deploy any open-weight model using SGLang and away you go.
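For reference, serving an open-weight model with SGLang on an 8-GPU node is roughly a one-liner (the model path is just an example; check the current SGLang docs for the flags that apply to your model and version):

```shell
# Sketch: serve an open-weight model across 8 GPUs with tensor parallelism.
# Model path and flags are illustrative, not a tested deployment recipe.
python -m sglang.launch_server \
  --model-path deepseek-ai/DeepSeek-V3 \
  --tp 8 \
  --port 30000
```

The server then exposes an OpenAI-compatible API on the given port, so most coding assistants can point at it directly.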

htrp · today at 1:43 AM

8× H100 80 GB doesn't give you enough memory to run the latest 1T+ parameter models (especially at the context-window lengths needed to be competitive with the frontier models).
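The back-of-envelope arithmetic behind this claim can be sketched as follows (illustrative numbers only; it counts weight memory and ignores the KV cache and activation overhead, which make the real picture worse):

```python
# Back-of-envelope: do 1T parameters fit in 8x H100 80 GB?
# Weight memory only; real serving also needs KV-cache and activation memory.
GB = 1024**3

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """GiB needed just to hold the model weights."""
    return n_params * bytes_per_param / GB

total_vram = 8 * 80                      # 640 GiB across the node
fp8 = weight_memory_gb(1e12, 1.0)        # ~931 GiB at 8 bits/param
int4 = weight_memory_gb(1e12, 0.5)       # ~466 GiB at 4 bits/param

print(f"node VRAM:        {total_vram} GiB")
print(f"1T params @ FP8:  {fp8:.0f} GiB ({'fits' if fp8 < total_vram else 'does not fit'})")
print(f"1T params @ INT4: {int4:.0f} GiB ({'weights fit' if int4 < total_vram else 'does not fit'})")
```

So even at FP8 the weights alone overflow the node; aggressive 4-bit quantization fits the weights but leaves limited headroom for long context windows.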

Havoc · yesterday at 11:53 PM

DeepSeek, GLM, MiniMax, or Kimi are the most likely contenders.
