logoalt Hacker News

adrian_btoday at 10:26 AM1 replyview on HN

You are right, which is why I do not intend to use a GGUF file but a set of files with a different layout, and this is why I need to make changes in llama.cpp.


Replies

zozbot234today at 1:58 PM

If you have to come up with a custom format anyway, why not just make it a draft extension to GGUF layout definitions (something like "coalesced expert fetch" or the like) and submit it for inclusion in the standard? Then future models could be autoconverted to such a format.