girvo • last Saturday at 10:50 PM • 1 reply • view on HN

Being grafted onto the main model reduces the layer duplication you’d otherwise have, at least for Step and Qwen 3.6.

Replies

Step 2.7’s MTP seems broken (at least in ik_llama.cpp): the draft model starts and ends in block 3, but ik_llama bails out looking for block 0 :(
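For anyone trying to picture the failure, here is a rough sketch of that kind of mismatch. It is not ik_llama.cpp's actual code: the tensor names are made up (only the `blk.<n>.` prefix follows the usual GGUF naming style), and both loader functions are hypothetical. It just contrasts a loader that hard-codes block 0 with one that starts from whichever block is actually present.

```python
import re

# Hypothetical tensor names for a draft (MTP) head whose layers all sit in block 3.
# Only the "blk.<n>." prefix mirrors common GGUF naming; the rest is illustrative.
DRAFT_TENSORS = [
    "blk.3.attn_norm.weight",
    "blk.3.attn_q.weight",
    "blk.3.ffn_down.weight",
]

BLOCK_RE = re.compile(r"^blk\.(\d+)\.")


def block_indices(names):
    """Return the sorted block indices that appear in the tensor names."""
    found = set()
    for name in names:
        match = BLOCK_RE.match(name)
        if match:
            found.add(int(match.group(1)))
    return sorted(found)


def load_assuming_block_zero(names):
    """Hypothetical loader that insists the draft layers start at block 0."""
    blocks = block_indices(names)
    if 0 not in blocks:
        raise KeyError("draft block 0 not found")
    return blocks


def load_from_first_present_block(names):
    """Hypothetical loader that accepts whatever block range is present."""
    blocks = block_indices(names)
    if not blocks:
        raise KeyError("no draft blocks found")
    return blocks
```

With the names above, `load_assuming_block_zero(DRAFT_TENSORS)` raises `KeyError`, while `load_from_first_present_block(DRAFT_TENSORS)` returns the block list. Whether the real fix is this simple depends on how the loader maps draft layers, which I have not checked.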
