girvo • today at 9:23 AM

It's being explored right now for speculative decoding in the local-LLM space, which I think is quite an interesting use case.

roger_ • today at 11:30 AM

DFlash immediately came to my mind.

There are several Mac implementations of it that already show Qwen3.5 running more than 2x faster.
