> So... are _all_ introductions to machine learning just extremely wrong here?
It's more of a heuristic. Most people have their first experience in Excel, where you can fit a polynomial. Cranking up degree will always improve r2 (since excel doesn't do a holdout), so it's a very common mistake new students make.
It's much more understandable at the beginner level to say "you'll overfit if you crank up the degree" than it is to explain regularization and basises. Later on you can introduce it, but early on it's confusing and distracting to students who might not even know how to solve for an ordinary least squares model.