Hacker News

julianlam, yesterday at 7:22 PM (2 replies)

Does this mean there will be new Gemma 4 models released with MTP, or are they already available in existing models + quants?


Replies

adrian_b, yesterday at 8:37 PM

For each of the 4 gemma-4-*-it models, an associated small model, gemma-4-*-it-assistant, has been published for use with MTP.

If a GGUF file is generated for MTP, it must include both the big model and the small model. Another comment referenced a PR for llama.cpp that also updates the Python program used for conversion from the safetensors files; presumably it can handle combining the two paired Gemma 4 models.
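
To illustrate the pairing by name, here is a minimal Python sketch. It assumes only the naming pattern mentioned above (gemma-4-*-it paired with gemma-4-*-it-assistant); the helper name `assistant_model_name` and the placeholder `SIZE` are hypothetical and not part of llama.cpp or any published tool.

```python
def assistant_model_name(main_model: str) -> str:
    """Return the name of the small MTP companion for a gemma-4-*-it model.

    Assumption: the companion is named by appending "-assistant" to the
    main model name, as described in the comment above.
    """
    if not (main_model.startswith("gemma-4-") and main_model.endswith("-it")):
        raise ValueError(f"not a gemma-4-*-it model name: {main_model!r}")
    return main_model + "-assistant"


# "SIZE" is a placeholder, not a real model size.
print(assistant_model_name("gemma-4-SIZE-it"))
```

Passing a name that already ends in "-assistant" raises a ValueError, so the helper cannot be applied twice by accident.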

jug, yesterday at 9:22 PM

They have now been released on e.g. Hugging Face with the model suffix "-assistant".