Step 2.7’s MTP seems broken (at least for ik_llama.cpp) where the draft model starts and ends in blo...

alfiedotwtf • last Sunday at 11:18 AM • 1 reply

Step 2.7’s MTP seems broken (at least for ik_llama.cpp): the draft model starts and ends in block 3, but ik_llama bails out looking for block 0 :(

Replies

girvo • last Sunday at 12:37 PM

Aw, that’s a shame. I’m running the official llama.cpp on my Spark-alike, and it works great now. Proper triple head too, which is what it was trained on; gets me up to 35-40 tk/s decode.
