See also this stackexchange answer, which makes basically the same point:

ssegert • 04/23/2025

https://stats.stackexchange.com/questions/560383/how-can-we-...

I think the posted article has some important confusions. The first is the distinction between a basis and an objective. If I fit a polynomial of a given degree to pass through every point in my dataset, I end up with exactly the same polynomial no matter which basis I use to represent it (although the basis-specific coefficients will of course differ). This is because all polynomial bases of that degree (by definition) span the same hypothesis class, and the recipe "fit a polynomial that passes through every point" can be expressed as an optimization problem over that hypothesis class. More generally, any two bases of the same space give exactly the same solution when optimizing the same objective, provided the objective depends only on the fitted function and not on the coefficients themselves (and ignoring things like floating point errors).
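To make that concrete, here is a minimal sketch (my own illustration, not code from the article) that interpolates the same points in the standard basis and in the Bernstein basis. The helper names `monomial_design` and `bernstein_design`, the degree, and the random data are all assumptions chosen for the demo.

```python
import numpy as np
from math import comb

def monomial_design(x, degree):
    # Columns are 1, x, x^2, ..., x^degree.
    return np.vander(np.asarray(x, dtype=float), degree + 1, increasing=True)

def bernstein_design(x, degree):
    # Columns are B_{k,n}(x) = C(n,k) x^k (1-x)^(n-k) for k = 0..n, on [0, 1].
    k = np.arange(degree + 1)
    binom = np.array([comb(degree, int(j)) for j in k], dtype=float)
    x = np.asarray(x, dtype=float)[:, None]
    return binom * x**k * (1.0 - x) ** (degree - k)

rng = np.random.default_rng(0)
degree = 5
x = np.linspace(0.0, 1.0, degree + 1)   # degree + 1 distinct points
y = rng.normal(size=degree + 1)         # arbitrary illustrative data

# Same objective ("pass through every point"), two different bases.
coef_mono = np.linalg.solve(monomial_design(x, degree), y)
coef_bern = np.linalg.solve(bernstein_design(x, degree), y)

# Compare the resulting polynomials on a dense grid.
grid = np.linspace(0.0, 1.0, 201)
fit_mono = monomial_design(grid, degree) @ coef_mono
fit_bern = bernstein_design(grid, degree) @ coef_bern
max_gap = float(np.max(np.abs(fit_mono - fit_bern)))
print("max difference between the two fitted curves:", max_gap)
```

The coefficient vectors differ, but the two curves agree up to floating point error. The same should hold for plain unregularized least squares with more points than coefficients, since that objective also depends only on the fitted function.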

So why do the results with the Bernstein basis look better than with the standard basis? It's because they are actually optimizing a different objective function (which is explicitly written out in the stackexchange post). So in that sense the Bernstein-vs.-standard-basis comparison is really a comparison between different objective functions rather than between different bases. It so happens that the solution to the new objective function has a simple expression in terms of the Bernstein basis, but in principle I could set up and solve the objective in terms of the standard basis and obtain exactly the same result. From this perspective, the Bernstein polynomials are nothing more than a convenient computational device for expressing the solution to the objective. The OP kind of gets at this with the distinction between Fitting and Interpolation, but seems to conflate different fitting procedures with different bases.
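A rough sketch of this point, under the assumption that the alternative procedure is the classical Bernstein approximation, where the k-th Bernstein coefficient is simply the sampled value f(k/n). The linked post may write the objective differently; the target function, the degree, and the helper names `bernstein_eval` and `bernstein_to_monomial` are mine and purely illustrative.

```python
import numpy as np
from math import comb

def bernstein_eval(coefs, x):
    # Evaluate sum_k coefs[k] * C(n,k) x^k (1-x)^(n-k) on [0, 1].
    n = len(coefs) - 1
    x = np.asarray(x, dtype=float)
    return sum(c * comb(n, k) * x**k * (1.0 - x) ** (n - k)
               for k, c in enumerate(coefs))

def bernstein_to_monomial(coefs):
    # Change of basis: a_i = sum_{k<=i} (-1)^(i-k) C(n,i) C(i,k) b_k.
    n = len(coefs) - 1
    mono = np.zeros(n + 1)
    for i in range(n + 1):
        for k in range(i + 1):
            mono[i] += (-1) ** (i - k) * comb(n, i) * comb(i, k) * coefs[k]
    return mono

n = 10
nodes = np.linspace(0.0, 1.0, n + 1)
f = lambda t: np.sin(2.0 * np.pi * t)    # illustrative target function

# Assumed procedure: Bernstein coefficients are the sampled values f(k/n).
bern_coefs = f(nodes)
mono_coefs = bernstein_to_monomial(bern_coefs)

# The same polynomial, written in the standard basis, gives the same curve.
grid = np.linspace(0.0, 1.0, 201)
curve_bern = bernstein_eval(bern_coefs, grid)
curve_mono = np.polynomial.polynomial.polyval(grid, mono_coefs)
basis_gap = float(np.max(np.abs(curve_bern - curve_mono)))

# Unlike interpolation, this curve does not pass through the samples.
node_residual = float(np.max(np.abs(bernstein_eval(bern_coefs, nodes) - bern_coefs)))
print("basis gap:", basis_gap, "largest miss at the sample points:", node_residual)
```

The curve is identical in either basis, yet it misses the interior sample points, which is what you would expect if the smoothness comes from the procedure rather than from the basis used to write down the answer.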

Secondly, there is actually no need for explicit regularization when fitting Bernstein polynomials (cf. the stackexchange post); the way the OP fits them is non-standard. That said, one can say that the specific form of the objective function provides an important source of implicit regularization.

So in conclusion I think the OP has importantly mis-attributed the source of success of the Bernstein method. It does not have to do with explicit regularization or choice of basis, and has everything to do with re-defining the objective function.
