One way to think of it: linear regression models noise only in y, not in x, whereas the ellipse (the principal eigenvector from PCA) models noise in both x and y, treating the two as equally noisy.
So when fitting a trend, e.g. for data analytics, should we use the principal eigenvector from PCA instead of linear regression?
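To make that concrete, here is a minimal sketch comparing the two fits. The synthetic data (a true line y = 2x with unit-variance Gaussian noise on both coordinates) and the helper names `ols_slope` and `pca_slope` are assumptions made up for illustration, not anything from the thread.

```python
import numpy as np

def ols_slope(x, y):
    """Slope of the least-squares line of y on x (vertical residuals only)."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (xc @ xc))

def pca_slope(x, y):
    """Slope of the first principal axis (perpendicular residuals)."""
    cov = np.cov(x, y)                      # 2x2 sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    vx, vy = eigvecs[:, -1]                 # eigenvector of the largest eigenvalue
    return float(vy / vx)

# Synthetic data (assumed for illustration): true line y = 2x,
# with unit-variance Gaussian noise added to BOTH coordinates.
rng = np.random.default_rng(0)
t = rng.uniform(-5.0, 5.0, 20_000)
x = t + rng.normal(0.0, 1.0, t.size)
y = 2.0 * t + rng.normal(0.0, 1.0, t.size)

slope_ols = ols_slope(x, y)
slope_pca = pca_slope(x, y)
print(f"OLS slope: {slope_ols:.3f}")
print(f"PCA slope: {slope_pca:.3f}")
```

Because x is noisy, the OLS slope gets pulled toward zero (regression dilution), while the PCA slope should land near the true value of 2. That only holds here because the noise is equally large in both coordinates.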
Is there any way to improve the fit if we know that, say, y is n times as noisy as x? Or more generally, if we know the (approximate) noise distribution for each free variable?
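For the known-ratio case, one standard option is Deming regression, which takes the ratio of the two noise variances as an input. Below is a minimal sketch, assuming "n times as noisy" means the noise standard deviation in y is n times that in x, so the variance ratio is delta = n**2 (if n is already a variance ratio, pass delta = n). The synthetic data and the name `deming_slope` are assumptions for illustration.

```python
import numpy as np

def deming_slope(x, y, delta):
    """Deming regression slope.

    delta = var(noise in y) / var(noise in x).
    Assumes the sample covariance of x and y is nonzero.
    """
    sxx = np.var(x, ddof=1)
    syy = np.var(y, ddof=1)
    sxy = np.cov(x, y)[0, 1]
    d = syy - delta * sxx
    return float((d + np.sqrt(d * d + 4.0 * delta * sxy * sxy)) / (2.0 * sxy))

# Synthetic data (assumed for illustration): true line y = 2x,
# noise std 1 in x and noise std n = 3 in y.
n = 3.0
rng = np.random.default_rng(1)
t = rng.uniform(-5.0, 5.0, 20_000)
x = t + rng.normal(0.0, 1.0, t.size)
y = 2.0 * t + rng.normal(0.0, n, t.size)

slope_deming = deming_slope(x, y, delta=n**2)  # uses the known noise ratio
slope_equal = deming_slope(x, y, delta=1.0)    # equal-noise assumption (the PCA axis)
slope_large = deming_slope(x, y, delta=1e6)    # y far noisier than x: approaches OLS
print(f"delta = n^2: {slope_deming:.3f}")
print(f"delta = 1:   {slope_equal:.3f}")
print(f"delta = 1e6: {slope_large:.3f}")
```

Setting delta = 1 gives the same line as the PCA eigenvector, and letting delta grow large recovers ordinary linear regression, so both fits discussed above are special cases of this one.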
It might be cool to train a neural network by minimizing the error under the assumption that there's noise on both the inputs and the outputs.
That brings up an interesting issue, which is that many systems do have more noise in y than in x. Take time-series data from an analog-to-digital converter, for instance: the time axis is driven by a crystal oscillator, so it is far less noisy than the sampled values.