Curious how this handles non-determinism. Most transformer inference samples with temperature > 0, which means the "program execution" is probabilistic: the same input can produce different outputs from one call to the next. The interesting question is whether the speedup holds when you need consistent outputs across multiple calls.
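
To make the concern concrete, here is a toy sketch in plain Python of how temperature changes repeatability. The logits are made up and `sample_token` is a hypothetical helper written for illustration, not anything from the project under discussion; real inference stacks do this over a full vocabulary on the accelerator.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from raw logits at the given temperature."""
    if temperature == 0:
        # Greedy decoding: always the highest-scoring token, so repeatable.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature-scaled softmax weights (subtract the max for stability).
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    # Draw one index in proportion to the weights.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [2.0, 1.5, 0.3]  # made-up scores for three candidate tokens

# Temperature 0: the result does not depend on the random state at all.
greedy = {sample_token(logits, 0, random.Random(seed)) for seed in range(20)}

# Temperature 1: different random states give different tokens.
sampled = [sample_token(logits, 1.0, random.Random(seed)) for seed in range(200)]

print("distinct greedy picks:", len(greedy))
print("distinct sampled picks:", len(set(sampled)))
```

In this sketch, consistency across calls comes either from temperature 0 or from pinning the random state per call. Whether either option preserves the claimed speedup is the open question.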