"GLM 5.2 is "almost Opus," and it needs at least 8xH200s for comfortable inference .....

rsync • yesterday at 8:20 PM • 1 reply

"GLM 5.2 is "almost Opus," and it needs at least 8xH200s for comfortable inference ..."

What is the behavior if one were to run GLM 5.2 with only a single H200?

Would it fail to run at all, or would it just run so slowly as to be unusable?

I would like to prove out the build and the concept of running a SOTA model locally, then backfill the rest of the GPUs in 18-24 months when they cost significantly less ...

Replies

BoorishBears • yesterday at 10:56 PM

> in 18-24 months when they cost significantly less ...

going to need you to sit down for this one...
