adrian_b • today at 10:26 AM

You are right, which is why I do not intend to use a GGUF file but rather a set of files with a different layout, and that is why I need to make changes to llama.cpp.

Replies

zozbot234 • today at 1:58 PM

If you have to come up with a custom format anyway, why not make it a draft extension to the GGUF layout definitions (named something like "coalesced expert fetch") and submit it for inclusion in the standard? Future models could then be autoconverted to that format.
