logoalt Hacker News

mkllast Tuesday at 10:09 PM1 replyview on HN

The degree-10000 polynomial is definitely overfitting - every data point has its own little spike. The truncated curves look kind of nice, but the data points aren't shown on those plots, and the curves aren't very close to them.

There are also some enormous numbers hidden away here too, with associated floating point precision problems; the articles show the coefficients are small, but that's because the polynomial basis functions themselves have gigantic numbers in them. The Bernstein basis for degree-100 involves 100 choose 50, which is already > 10^29. You have to be careful calculating these polynomials or bits of your calculation exceed 64-bit floating point range, e.g. factorials of 10000, 2^10000. See the formulas and table in this section: https://en.wikipedia.org/wiki/Legendre_polynomials#Rodrigues...


Replies

scytheyesterday at 3:02 AM

You can probably calculate the value of P_n(x) using the equation:

P_n(x) = (2 - 1/n) x P_{n-1}(x) - (1 - 1/n) P_{n-2}(x)

This is numerically stable if x is in [0,1], but I haven't validated that you can get to very high n using floating point.