danielhanchen • yesterday at 4:55 PM • 2 replies

Hey thanks - yes agreed - for now we do:

1. Split metadata into shard 0 for huge models so 10B is for chat template fixes. However, sometimes fixes cause a recalculation of the imatrix, which means all quants have to be re-made

2. Add HF discussion posts on each model talking about what changed, and on our Reddit and Twitter

3. Hugging Face XET now has de-duplicated downloading, so generally redownloading 100GB models again should be much faster - it splits the 100GB into small chunks and hashes them, and only downloads the chunks which have changed
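
To make #3 concrete, here is a rough sketch of the general idea of hash-based de-duplication: hash each chunk of the new file and fetch only the chunks whose hashes are not already on disk. This is only an illustration, not XET's actual implementation. The fixed chunk size, the SHA-256 hash, and the function names `chunk_hashes` and `chunks_to_download` are all assumptions made up for the example.

```python
import hashlib

# Hypothetical, tiny chunk size so the demo is readable; real systems
# use much larger chunks and may pick chunk boundaries differently.
CHUNK_SIZE = 4


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split data into fixed-size chunks and return one SHA-256 hex digest per chunk."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def chunks_to_download(local: bytes, remote: bytes,
                       chunk_size: int = CHUNK_SIZE) -> list[int]:
    """Return indices of remote chunks whose hash is not present locally."""
    have = set(chunk_hashes(local, chunk_size))
    return [
        index
        for index, digest in enumerate(chunk_hashes(remote, chunk_size))
        if digest not in have
    ]


if __name__ == "__main__":
    old_file = b"AAAABBBBCCCCDDDD"
    new_file = b"AAAABBBBXXXXDDDD"  # only the third chunk differs
    print(chunks_to_download(old_file, new_file))
```

With a scheme like this, a small fix that touches one region of a file only costs the chunks covering that region. A change that alters most of the bytes (such as a full re-quantization) would still need most chunks fetched again.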

Replies

ssrshh • today at 1:37 AM

If you happen to know - is this also why LM Studio and Ollama model downloads often fail with a signature mismatch error?

evilduck • yesterday at 7:40 PM

Ah thanks, I wasn't aware of #3, that should be a huge boon.
