No. It's standard textbook knowledge that high-degree polynomials are numerically dangerous and need to be handled with care.
The article's examples only work because the interval 0 to 1 (or -1 to 1) was chosen. For whatever reason the author doesn't point that out, or even acknowledge that, had he chosen a larger interval, the limitations of floating-point arithmetic would have ruined the argument he was trying to make.
10^100 is a very large number and numerically difficult to handle. For whatever reason the author pretends this is not a valid reason to be cautious about high-degree polynomials.
Actually they aren't. You never compute high powers of the argument when working with specialized bases.
You use the recurrence relation that both the Bernstein basis and the orthogonal polynomial bases satisfy. This is implemented in numpy, so you don't have to do anything yourself; just call, for example, np.polynomial.legendre.legvander to get the features for the Legendre basis.
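For concreteness, here is a minimal sketch (my own example, not from the article) of fitting in the Legendre basis this way; the degree, data, and noise level are arbitrary:

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(-1.0, 1.0, 500)
    y = np.sin(5 * x) + 0.05 * rng.standard_normal(x.size)

    deg = 50
    V = np.polynomial.legendre.legvander(x, deg)  # (500, deg + 1) feature matrix, built by recurrence
    coef, *_ = np.linalg.lstsq(V, y, rcond=None)  # least-squares fit in the Legendre basis
    y_hat = V @ coef                              # fitted values; x**50 is never formed explicitly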
And a basis orthogonal over [-1, 1] is easily made orthogonal over an arbitrary interval: take p_i to be the i-th Legendre polynomial; then the basis composed of q_i(x) = p_i(2(x - a)/(b - a) - 1) is orthogonal over [a, b]. Each q_i is itself a polynomial of degree i, but you never use its monomial coefficients explicitly.
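A sketch of the same idea on an arbitrary interval (again my own example; the endpoints and target function are made up):

    import numpy as np

    a, b = 0.0, 1000.0                     # arbitrary interval, just for illustration
    x = np.linspace(a, b, 500)
    y = np.log1p(x)

    t = 2 * (x - a) / (b - a) - 1          # affine map [a, b] -> [-1, 1]
    V = np.polynomial.legendre.legvander(t, 50)
    coef, *_ = np.linalg.lstsq(V, y, rcond=None)

The q_i are evaluated implicitly through this rescaling; their coefficients never appear.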
There is an entire library for computing with polynomial approximants of functions over arbitrary intervals using orthogonal polynomials: Chebfun. The entire scientific computing and spectral differential equations community knows there are no numerical issues working with high-degree polynomials over arbitrary intervals.
The ML community just hasn't caught up.
Neural network training is harder when the input range is allowed to deviate from [-1, 1]. The only reason it sometimes works for neural networks is that the first layer has a chance to normalize the input.
If you have a set of points, why can't you just map them to an interval [0, 1] before fitting the polynomial?
He seems reasonably explicit about this:
""
This means that when using polynomial features, the data must be normalized to lie in an interval. It can be done using min-max scaling, computing empirical quantiles, or passing the feature through a sigmoid. But we should avoid the use of polynomials on raw un-normalized features.
""