I still remember learning about TAGE and perceptron predictors, and how machine learning and neural networks have long been used, in some form, in CPU architecture design.
The simplest binary saturating counter, à la the bimodal predictor, already achieves a success rate of more than 90%. What comes next is just an extension of that: use multiple bimodal predictors and build a forest of them. But the core idea of treating branch prediction with a Bayesian approach never fades.
It is a combined effort between hardware design and the software compiler, though.
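
For anyone who hasn't seen one, here is a rough sketch of the saturating-counter idea in Python. It assumes the common 2-bit variant (states 0 to 3, predict taken at 2 or above) and a table indexed by the low bits of the branch address. The names `BimodalPredictor`, `accuracy`, and `loop_trace`, the table size, and the synthetic loop trace are all made up for illustration; real hardware and real workloads will behave differently.

```python
class BimodalPredictor:
    """Table of 2-bit saturating counters indexed by low bits of the branch address."""

    def __init__(self, index_bits=10):
        # index_bits is an illustrative assumption, not a figure from any real CPU.
        self.mask = (1 << index_bits) - 1
        # Counter states 0..3; start every entry at 1 (weakly not-taken).
        self.table = [1] * (1 << index_bits)

    def predict(self, pc):
        # States 2 and 3 predict "taken", states 0 and 1 predict "not taken".
        return self.table[pc & self.mask] >= 2

    def update(self, pc, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        i = pc & self.mask
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def accuracy(predictor, trace):
    """Fraction of correct predictions over a list of (pc, taken) pairs."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(trace)


def loop_trace(pc=0x400, iterations=10, repeats=100):
    """Synthetic trace: a loop back-edge taken iterations-1 times, then not taken once."""
    return [(pc, i < iterations - 1) for _ in range(repeats) for i in range(iterations)]


if __name__ == "__main__":
    print(accuracy(BimodalPredictor(), loop_trace(iterations=20)))
```

The two-bit hysteresis is the point: a single loop exit only nudges the counter from "strongly taken" to "weakly taken", so the predictor mispredicts the exit itself but not the first iteration of the next run of the loop.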