Hacker News

zargon, yesterday at 5:56 PM

Why do you merge the GGUFs? The 50 GB files are more manageable (IMO) and you can verify checksums as you say.


Replies

sowbug, yesterday at 7:19 PM

I admit it's a habit that's probably weeks out of date. Earlier engines barfed on split GGUFs, but support is a lot better now. Frontends didn't always infer the model name correctly from the first chunk's filename, but once llama.cpp added the models.ini feature, that objection went away.

The purist in me feels the 50GB chunks are a temporary artifact of Hugging Face's uploading requirements, and the authoritative model file should be the merged one. I am unable to articulate any practical reason why this matters.
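
For anyone who keeps the chunks as-is, here is a minimal sketch of the checksum verification mentioned above. It assumes the chunks use the common `name-00001-of-0000N.gguf` naming, that a directory holds the chunks of only one model, and that you already have the expected SHA-256 hashes from wherever the uploader publishes them. The function names (`sha256_file`, `missing_chunks`, `verify_chunks`) are made up for illustration and are not part of llama.cpp or any other tool.

```python
import hashlib
import re
from pathlib import Path

# Assumed naming convention for split GGUF uploads, e.g. model-00001-of-00005.gguf.
SPLIT_RE = re.compile(r"^(?P<base>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$")


def sha256_file(path, block_size=1 << 20):
    """Hash a file in 1 MiB blocks so a 50 GB chunk never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def missing_chunks(directory):
    """Return the chunk indices that are absent, based on the -of-N suffix.

    Assumes the directory contains the chunks of a single model.
    """
    found = set()
    total = 0
    for p in Path(directory).iterdir():
        m = SPLIT_RE.match(p.name)
        if m:
            found.add(int(m["index"]))
            total = max(total, int(m["total"]))
    return [i for i in range(1, total + 1) if i not in found]


def verify_chunks(directory, expected):
    """Compare each file against its expected hash.

    expected: {filename: hex SHA-256}. Returns {filename: bool}; a file that
    is missing counts as a failed check.
    """
    results = {}
    for name, want in expected.items():
        path = Path(directory) / name
        results[name] = path.is_file() and sha256_file(path) == want.lower()
    return results
```

Running `missing_chunks` first is cheap and catches an incomplete download before you spend several minutes hashing. The same `sha256_file` works on a merged file too, provided someone has published a hash for the merged form, which is one practical argument for leaving the chunks alone.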