https://arxiv.org/pdf/2503.02113
This paper shows that polynomials exhibit most of the hallmark behaviors of deep neural nets, including double descent and the ability to memorize an entire dataset.
It connects the dots: the polynomials are regularized to be as simple as possible, and the author argues that the hundreds of billions of parameters in modern neural networks act as a regularizer too, attenuating decisions that are "too risky."
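If you want to see what "double descent with polynomials" looks like concretely, here is a minimal sketch (my own illustration, not the paper's exact setup): minimum-norm least squares on a Legendre polynomial basis, where test error typically spikes near the interpolation threshold (degree ≈ number of training points) and falls again as the degree grows far past it. The specific function, noise level, and degrees below are arbitrary choices for illustration.

```python
# Sketch of double descent with polynomial regression (not the paper's exact setup).
# We fit polynomials of increasing degree to a small noisy dataset using the
# minimum-norm least-squares solution, which acts as an implicit regularizer
# once the model is over-parameterized (degree + 1 > number of training points).
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, noise = 15, 200, 0.3
f = lambda x: np.sin(2 * np.pi * x)          # ground-truth function (arbitrary choice)

x_tr = rng.uniform(-1, 1, n_train)
y_tr = f(x_tr) + noise * rng.standard_normal(n_train)
x_te = np.linspace(-1, 1, n_test)
y_te = f(x_te)

def legendre_features(x, degree):
    # Legendre basis keeps the design matrix well-conditioned on [-1, 1]
    return np.polynomial.legendre.legvander(x, degree)

for degree in [1, 3, 7, 14, 20, 50, 200]:
    Phi_tr = legendre_features(x_tr, degree)
    Phi_te = legendre_features(x_te, degree)
    # lstsq returns the minimum-norm solution when the system is underdetermined
    w, *_ = np.linalg.lstsq(Phi_tr, y_tr, rcond=None)
    train_mse = np.mean((Phi_tr @ w - y_tr) ** 2)
    test_mse = np.mean((Phi_te @ w - y_te) ** 2)
    print(f"degree={degree:4d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

Near degree 14 the polynomial exactly memorizes the 15 training points (train MSE ≈ 0) while test error blows up; at much higher degrees the minimum-norm solution tends to smooth back out, which is the second descent.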
I really enjoyed the paper; it's a gem that sheds light in every direction.