So this is like branch prediction for operating systems? Except we have probability baked into the model itself so it’s even more reliable.
similar idea, but the failure mode is better. a branch mispredict burns cycles. a bad guess here usually just means no bonus tokens. https://arxiv.org/abs/2211.17192
similar idea, but the failure mode is better. a branch mispredict burns cycles. a bad guess here usually just means no bonus tokens. https://arxiv.org/abs/2211.17192