Hacker News

bcoates today at 4:19 AM

There's something I'm fundamentally missing here--if the standard basis and the Bernstein basis describe exactly the same set of polynomials of degree n, then surely the polynomial of degree n that minimizes the mean square error is unique (and independent of the basis--the error is the samples vs the approximation, the coefficients/basis are not involved) so both the standard basis and Bernstein basis solutions are the same (pathological, overfitted, oscillating) curve?

Like I understand how the standard basis is pathological because the higher degree powers diverge like mad so given "reasonable" coefficients the Bernstein basis is more likely to give "reasonable" curves but if you're already minimizing I don't understand how you arrive at a different curve.

What am I missing?


Replies

pavpanchekha today at 4:23 AM

The minimization is regularized, meaning you add a penalty term for large coefficients. The coefficients will be different for the two bases, meaning the regularization will work differently.
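
A minimal sketch of that point, assuming a ridge (L2) penalty on the coefficients and samples on [0, 1]; the data, degree, penalty weight, and helper names (`power_basis`, `bernstein_basis`, `fit`) are made up for illustration and are not from the article. With no penalty both bases should give the same curve up to rounding; with the penalty the curves should differ.

```python
import numpy as np
from math import comb

def power_basis(x, n):
    # Columns 1, x, x^2, ..., x^n
    return np.vander(x, n + 1, increasing=True)

def bernstein_basis(x, n):
    # Columns C(n,k) x^k (1-x)^(n-k) for k = 0..n, on [0, 1]
    return np.stack(
        [comb(n, k) * x**k * (1 - x) ** (n - k) for k in range(n + 1)], axis=1
    )

def fit(A, y, lam):
    # Minimize ||A c - y||^2 + lam * ||c||^2 by solving the stacked
    # least-squares system [A; sqrt(lam) I] c = [y; 0]. lam = 0 is plain
    # least squares.
    m = A.shape[1]
    A_aug = np.vstack([A, np.sqrt(lam) * np.eye(m)])
    y_aug = np.concatenate([y, np.zeros(m)])
    return np.linalg.lstsq(A_aug, y_aug, rcond=None)[0]

# Hypothetical noisy samples of a smooth function
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

n = 5
P, B = power_basis(x, n), bernstein_basis(x, n)

def max_curve_gap(lam):
    # Largest difference between the two fitted curves at the sample points
    return np.max(np.abs(P @ fit(P, y, lam) - B @ fit(B, y, lam)))

gap_unreg = max_curve_gap(0.0)  # same polynomial, different coordinates
gap_ridge = max_curve_gap(0.1)  # same penalty, different curves

print(f"no penalty: {gap_unreg:.2e}   ridge: {gap_ridge:.2e}")
```

The penalty is the only place the coefficients enter the objective, which is why the choice of basis changes nothing until `lam` is nonzero.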
