It also took me a little while to realize “least squares” and MMSE approaches were not necessarily the “correct” way to do things but just “one thing we actually know how to do” because everything else is much harder.
We can use Calculus to do so much but also so little…
That isn't the case; mathematicians (especially the statisticians) will do pages of calculations if they can prove one approach is technically superior to another. These people, as a class, are the crazies who invented matrix multiplication. Something like MMSE is used because it has provably optimal properties for estimating a posterior distribution.
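For what it's worth, the optimality claim is easy to check numerically. Here is a minimal sketch, assuming we stand in for a posterior with samples from a skewed gamma distribution (my choice, purely for illustration; the names `samples`, `mean_squared_error` and `candidates` are made up for this example). It scans candidate point estimates and finds which one gives the lowest mean squared error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for draws from a posterior: a skewed gamma distribution.
samples = rng.gamma(shape=2.0, scale=1.0, size=100_000)


def mean_squared_error(estimate):
    """Average squared distance between the samples and a single point estimate."""
    return np.mean((samples - estimate) ** 2)


# Brute-force scan over candidate point estimates on a 0.01-spaced grid.
candidates = np.linspace(0.0, 6.0, 601)
errors = np.array([mean_squared_error(c) for c in candidates])
best = candidates[np.argmin(errors)]

print("best candidate on grid:", best)
print("sample mean:           ", samples.mean())
print("sample median:         ", np.median(samples))
```

The grid point with the lowest squared error should land next to the sample mean rather than the median, which is the sense in which the MMSE estimate (the posterior mean) is optimal. Swap the squared error for absolute error and the median wins instead, so "optimal" always depends on the loss you picked.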
It is certainly possible that there are complex approaches the statisticians have not discovered, or don't teach because they are too complicated. But they had a big fight early in the discipline's history about which techniques were provably superior, and what got standardised on wasn't chosen for ease of calculation. It has actually been quite interesting how little interest the statisticians seem to be taking in things like the machine learning revolution, since the mathematics all seems pretty amenable to last century's techniques despite orders-of-magnitude differences in the data being handled.