Is it worth running speculative decoding on models with a small active parameter count like this one? Or does MTP (multi-token prediction) make speculative decoding unnecessary?