Anyone want to bet that, much like speculative execution, speculative decoding is going to introduce a whole slew of vulnerabilities in the way LLMs work?
I don't think so, because every speculatively predicted token is still validated by the main model (checking a batch of draft tokens is faster than generating them one by one from scratch), and a token is only accepted if it exactly matches what the main model would have produced itself.
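To make that concrete, here is a minimal sketch of the accept/reject loop for the greedy case. Everything in it is hypothetical and for illustration only: `toy_target` and `toy_draft` are stand-in functions rather than real models, and the names `speculative_step`, `generate_speculative`, and `generate_plain` are made up. A real implementation would verify all draft tokens in a single batched forward pass, and sampling-based variants use a probabilistic acceptance rule instead of an exact match.

```python
from typing import Callable, List

# A "model" here is just a function: context tokens -> next token (greedy).
Model = Callable[[List[int]], int]


def speculative_step(target: Model, draft: Model, context: List[int], k: int) -> List[int]:
    """Draft k tokens, then keep only the prefix the target agrees with."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    ctx = list(context)
    for _ in range(k):
        token = draft(ctx)
        proposed.append(token)
        ctx.append(token)

    # 2. The target checks each proposed token in order.
    #    (Done sequentially here; real systems batch this into one pass.)
    accepted: List[int] = []
    ctx = list(context)
    for token in proposed:
        expected = target(ctx)
        if token != expected:
            # First mismatch: discard the rest and use the target's own token.
            accepted.append(expected)
            return accepted
        accepted.append(token)
        ctx.append(token)

    # 3. All k drafts matched, so the target contributes one extra token.
    accepted.append(target(ctx))
    return accepted


def generate_speculative(target: Model, draft: Model, prompt: List[int], n: int, k: int = 4) -> List[int]:
    out = list(prompt)
    while len(out) - len(prompt) < n:
        out.extend(speculative_step(target, draft, out, k))
    return out[: len(prompt) + n]


def generate_plain(target: Model, prompt: List[int], n: int) -> List[int]:
    out = list(prompt)
    for _ in range(n):
        out.append(target(out))
    return out


# Hypothetical toy models: the draft disagrees with the target on every
# context whose length is a multiple of 3.
def toy_target(ctx: List[int]) -> int:
    return (sum(ctx) * 31 + len(ctx)) % 50


def toy_draft(ctx: List[int]) -> int:
    token = toy_target(ctx)
    return token if len(ctx) % 3 else (token + 1) % 50


if __name__ == "__main__":
    prompt = [1, 2, 3]
    same = generate_speculative(toy_target, toy_draft, prompt, 20) == generate_plain(toy_target, prompt, 20)
    print("outputs identical:", same)
```

In this sketch every emitted token is either a draft token the target agreed with or the target's own choice, so a bad draft model can only cost speed, not change the output.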