ForceBru • 04/22/2025 • 2 replies

The article specifically points out that these polynomials only work well on specific intervals (emphasis copied from the article):

"The second source of their bad reputation is misunderstanding of Weierstrass’ approximation theorem. It’s usually cited as “polynomials can approximate arbitrary continuous functions”. But that’s not entirely true. They can approximate arbitrary continuous functions in an interval. This means that when using polynomial features, the data must be normalized to lie in an interval. It can be done using min-max scaling, computing empirical quantiles, or passing the feature through a sigmoid. But we should avoid the use of polynomials on raw un-normalized features."
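
For readers who want to see the first normalization option from the quote, here is a minimal sketch of min-max scaling onto [0, 1]. The function name `min_max_scale` and the sample values are made up for illustration; they are not from the article.

```python
def min_max_scale(values):
    """Map values linearly onto [0, 1] using their observed min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw feature values on an arbitrary scale.
raw = [12.0, 150.0, 87.5, 3.0]
scaled = min_max_scale(raw)
print(scaled)
```

Note that the minimum and maximum come from the data at hand, so values seen later can still fall outside [0, 1] unless they are clipped.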

As I understand it, one of the main ideas of this series of posts is that normalizing features to very specific intervals is important when fitting polynomials. I don't think this "went completely uncommented".

Replies

nestorD • 04/22/2025

Yes! And the next articles in the series double down on this:

"Any polynomial basis has a “natural domain” where its approximation properties are well-known. Raw features must be normalized to that domain. The natural domain of the Bernstein basis is the interval [0,1]."
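
To make the "natural domain" point concrete, here is a small sketch using the standard definition of the Bernstein basis, b_{k,n}(x) = C(n,k) x^k (1-x)^(n-k). The helper name `bernstein_basis`, the degree 20, and the evaluation points are illustrative choices, not taken from the article.

```python
from math import comb


def bernstein_basis(n, x):
    """Evaluate all n + 1 Bernstein basis polynomials of degree n at x."""
    return [comb(n, k) * x**k * (1.0 - x) ** (n - k) for k in range(n + 1)]


# Inside [0, 1]: every basis value is between 0 and 1, and they sum to 1.
inside = bernstein_basis(20, 0.3)
print(min(inside), max(inside), sum(inside))

# Outside [0, 1]: (1 - x) is negative, so the terms alternate in sign
# and grow very large in magnitude.
outside = bernstein_basis(20, 10.0)
print(max(abs(v) for v in outside))
```

On [0, 1] the basis values behave like weights. Away from that interval the same formula produces huge terms of mixed sign, which is one way to see why the raw features need to be mapped into the interval first.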

constantcrying • 04/22/2025

The quote has absolutely nothing to do with my point.

The scaling to an interval in the quote is there for formal mathematical reasons, in particular that polynomials do not approximate continuous functions globally. This is totally unrelated to numerics.

The issue is that the interval 0 to 1 in particular has to be chosen, as otherwise the numerics totally fall apart. The message of the article is that high-degree polynomials pose no danger, but that is wrong. All the examples in the article only work because of a specific choice of interval. All the major numerical issues, which would immediately invalidate the core thesis of the article, are totally ignored. If you calculate 10^100 in 64-bit floating point you will run into trouble. The article pretends that will not be the case.
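
A rough sketch of the kind of trouble meant here, using only the Python standard library (Python floats are 64-bit IEEE 754 doubles). This is one illustration of precision loss at that magnitude, not a reproduction of anything in the article. The value 10^100 itself does not overflow, but numbers of ordinary size are lost completely when combined with it.

```python
import math
import sys

big = 10.0 ** 100

# 10^100 is still representable: the largest double is roughly 1.8e308.
print(big < sys.float_info.max)

# Absorption: adding 1 to 10^100 changes nothing, so the 1 is lost entirely.
absorbed = (big + 1.0) - big
print(absorbed)

# Spacing between adjacent doubles near 10^100 (math.ulp needs Python 3.9+).
gap = math.ulp(big)
print(gap)
```

Any term smaller than that spacing vanishes when it is added to a term of size 10^100, which is what happens when a high-degree monomial is evaluated at x = 10 and then combined with lower-order terms.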
