Hacker News

girvo, today at 9:23 AM

It's being explored right now for speculative decoding in the local-LLM space, which I think is quite an interesting use case.

https://www.emergentmind.com/topics/dflash-block-diffusion-f...


Replies

roger_, today at 11:30 AM

DFlash immediately came to my mind.

There are already several Mac implementations of it that run Qwen3.5 more than 2x faster.