Hacker News

dist-epoch yesterday at 4:18 PM

What do you think about creating a tool that patches the chat template embedded in the .gguf file in place, instead of forcing a re-download? The whole-file hash can be checked afterwards.
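
A minimal sketch of the "check the whole-file hash afterwards" step, assuming the publisher lists a SHA-256 hex digest for the fixed file (the function names here are hypothetical, not part of any existing tool):

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so multi-GB .gguf files never sit fully in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare the local file against the digest published for the fixed file."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

If `matches_published_hash` returns `False` after patching, the local file differs from the published one in more than the template, and a re-download is the safe fallback.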


Replies

danielhanchen yesterday at 4:58 PM

Sadly it's not always chat template fixes :( But yes, for huge models we now split off the first shard as pure metadata (10MB), which includes the chat template etc., so for those fixes you only need to download that shard.
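
A rough sketch of picking out that first shard from a repo's file listing, so only it gets re-fetched. It assumes the shards use the `-00001-of-0000N.gguf` suffix that llama.cpp's split tooling produces; the helper name and the example filenames are hypothetical:

```python
import re

# Assumed shard suffix, e.g. "model-Q4_K_M-00001-of-00005.gguf".
SHARD_RE = re.compile(r"-(\d{5})-of-(\d{5})\.gguf$")


def first_shard(filenames):
    """Return the name of shard 1 in a listing, or None if nothing is sharded."""
    for name in filenames:
        match = SHARD_RE.search(name)
        if match and int(match.group(1)) == 1:
            return name
    return None
```

For a listing like `["m-00002-of-00003.gguf", "m-00001-of-00003.gguf"]` this returns `"m-00001-of-00003.gguf"`, which is the only file to download again when just the metadata changed.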

For serious fixes, we have to re-compute the imatrix since the activation patterns have changed - this makes the entire quant change a lot, hence you have to re-download :(