Hacker News

theophrastus, last Sunday at 11:11 PM, 5 replies

Had a QuantSci prof who was fond of asking, "Who can name a data collection scenario where the x data has no error?" and who then taught Deming regression as a generally preferred analysis [1].

[1] https://en.wikipedia.org/wiki/Deming_regression
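For anyone who has not seen it, here is a minimal sketch of a Deming fit using the closed-form slope. It assumes `delta` is the ratio of the y-error variance to the x-error variance and that this ratio is known in advance; the function name `deming_fit` and the sample data are made up for illustration.

```python
import numpy as np

def deming_fit(x, y, delta=1.0):
    """Closed-form Deming regression (illustrative sketch).

    delta is the assumed ratio var(y errors) / var(x errors).
    delta = 1 corresponds to orthogonal regression.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()

    # Second moments about the means; the slope does not depend on
    # whether these are normalized by n or n - 1.
    s_xx = np.mean((x - x_mean) ** 2)
    s_yy = np.mean((y - y_mean) ** 2)
    s_xy = np.mean((x - x_mean) * (y - y_mean))
    if s_xy == 0.0:
        raise ValueError("x and y are uncorrelated; slope is undefined")

    a = s_yy - delta * s_xx
    slope = (a + np.sqrt(a * a + 4.0 * delta * s_xy ** 2)) / (2.0 * s_xy)
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Noise-free points on y = 2x + 1 should be recovered up to rounding.
slope, intercept = deming_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)
```

In practice the hard part is usually justifying the value of `delta`, not computing the fit.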


Replies

moregrist, last Monday at 1:19 AM

Most of the time, if you have a sensor that you sample at, say, 1 kHz and you’re using a reliable MCU and clock, the sensor's noise terms will vastly dominate the sampling jitter.

So for a lot of sensor data, the error in the Y coordinate is orders of magnitude higher than the error in the X coordinate and you can essentially neglect X errors.
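A rough way to put a number on "essentially neglect": in the classical errors-in-variables model, noise in x shrinks the OLS slope by the factor var(x) / (var(x) + var(x error)). The sketch below plugs in hypothetical values (1 s of samples at 1 kHz, timestamp jitter with a 1 microsecond standard deviation); neither figure comes from a real device, and the helper name `attenuation_factor` is made up.

```python
import numpy as np

def attenuation_factor(x_var: float, x_err_var: float) -> float:
    """Factor by which x-noise shrinks the OLS slope toward zero
    (classical errors-in-variables model, errors independent of x)."""
    return x_var / (x_var + x_err_var)

# Hypothetical setup: 1000 timestamps spanning 1 s at 1 kHz.
t = np.arange(1000) / 1000.0
jitter_std = 1e-6  # assumed 1 microsecond timestamp jitter

lam = attenuation_factor(t.var(), jitter_std ** 2)
relative_bias = 1.0 - lam
print(f"relative slope bias from jitter: {relative_bias:.3e}")
```

With these assumed numbers the relative slope bias is on the order of 1e-11, far below anything the sensor noise would let you resolve.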

jmpeax, last Sunday at 11:32 PM

From that Wikipedia article, delta is the ratio of the y-error variance to the x-error variance. If the x variance is tiny compared to the y variance (often the case in practice), won't we get an ill-conditioned model due to the large delta?
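One way to probe this numerically: as delta grows, the closed-form Deming slope tends to the OLS slope s_xy / s_xx, so the model itself degrades gracefully. What can go wrong is the textbook formula, which subtracts two nearly equal large numbers when delta is huge. The sketch below uses an algebraically equivalent form that avoids that cancellation; the function name and the moment values are hypothetical.

```python
import numpy as np

def deming_slope_stable(s_xx, s_yy, s_xy, delta):
    """Deming slope from second moments, arranged to avoid cancellation.

    delta is the assumed ratio var(y errors) / var(x errors).
    """
    a = s_yy - delta * s_xx
    # r = sqrt(a^2 + 4 * delta * s_xy^2)
    r = np.hypot(a, 2.0 * np.sqrt(delta) * s_xy)
    if a >= 0.0:
        return (a + r) / (2.0 * s_xy)
    # For a < 0, (a + r) loses digits; multiply through by (r - a)
    # to get the equivalent, well-behaved expression.
    return 2.0 * delta * s_xy / (r - a)

# Hypothetical sample moments (they satisfy s_xy^2 <= s_xx * s_yy).
s_xx, s_yy, s_xy = 1.25, 5.5, 2.4
ols_slope = s_xy / s_xx

for delta in (1.0, 1e3, 1e6, 1e12):
    slope = deming_slope_stable(s_xx, s_yy, s_xy, delta)
    print(f"delta={delta:g}  deming={slope:.12f}  ols={ols_slope:.12f}")
```

So under these assumptions a large delta just means Deming and OLS agree, provided the slope is computed in a stable form.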

ghc, last Monday at 1:26 PM

In my field, the X data error (measurement jitter) is generally <10 ns, which might as well be no error.

Beretta_Vexee, last Monday at 9:28 AM

For most time series, noise in time measurement is negligible. However, this does not prevent complex coupling phenomena from occurring for other parameters, such as GPS coordinates.

RA_Fisher, last Monday at 11:48 AM

The issue in that case is that OLS is BLUE: under the Gauss-Markov assumptions it is the best linear unbiased estimator (best in the sense of minimum variance). This property is what makes OLS exceptional.