I wish they would release the requirements to run on llama.cpp with any announcements of open models...

aetherspawn • today at 2:31 AM

I wish they would release the requirements to run on llama.cpp with any announcements of open models.

A bonus would be tok/s on common hardware.

Replies

I don't think llama.cpp supports any of the LongCat models, actually.

They haven't posted weights/inference solutions for LongCat-2.0 [1], but LongCat-Next had transformers support, which I assume means it works with vLLM/SGLang.

Given it's 1.6T parameters, "common hardware" is probably out of the question: 2bpw alone works out to roughly 400GB of weights, and that's before considering the bandwidth requirements for 48B active parameters. I haven't read the LongCat-2.0 architecture docs, but if you're not running GLM-5.2, you're probably not running this either.
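For anyone who wants to redo the back-of-envelope math, here is a rough sketch in Python. It assumes the 1.6T total and 48B active figures quoted above, counts weights only (no KV cache, no quantization scales or metadata), and assumes every active weight is read from memory once per generated token. The 100 GB/s bandwidth is a made-up placeholder, not a measurement of any particular machine, and the function names are purely illustrative.

```python
def weight_size_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal GB (weights only, no metadata)."""
    return params * bits_per_weight / 8 / 1e9


def decode_tok_s_ceiling(active_params: float, bits_per_weight: float,
                         bandwidth_gb_s: float) -> float:
    """Crude upper bound on decode speed, assuming each active weight is
    read from memory exactly once per token and nothing else is a bottleneck."""
    gb_read_per_token = weight_size_gb(active_params, bits_per_weight)
    return bandwidth_gb_s / gb_read_per_token


TOTAL_PARAMS = 1.6e12    # figure quoted in the comment above
ACTIVE_PARAMS = 48e9     # figure quoted in the comment above
ASSUMED_BANDWIDTH = 100  # GB/s, hypothetical placeholder value

total_gb = weight_size_gb(TOTAL_PARAMS, 2)     # 2 bits per weight
active_gb = weight_size_gb(ACTIVE_PARAMS, 2)
ceiling = decode_tok_s_ceiling(ACTIVE_PARAMS, 2, ASSUMED_BANDWIDTH)

print(f"total weights at 2bpw:  {total_gb:.0f} GB")
print(f"active weights at 2bpw: {active_gb:.0f} GB read per token")
print(f"tok/s ceiling at {ASSUMED_BANDWIDTH} GB/s: {ceiling:.1f}")
```

Real throughput will land below that ceiling, since this ignores attention/KV-cache traffic, expert routing overhead, and any offloading to slower storage.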

[1] https://huggingface.co/meituan-longcat/LongCat-2.0: "Model weights coming soon — stay tuned!"
