One thing to consider is that this model uses a new architecture, so it may take time for llama.cpp to get updated, similar to how it was with Qwen Next.
Apparently it is the same as the DeepseekV3 architecture and already supported by llama.cpp once the new name is added. Here's the PR: https://github.com/ggml-org/llama.cpp/pull/18936