Hacker News

somewhatrandom9, today at 6:09 PM

Could these quantized models make MTP (Multi-Token Prediction) faster when used alongside the larger Gemma 4 models?