Hacker News

fjdjshsh | yesterday at 6:46 PM | 1 reply

I get your point, but I don't think your nit-pick is useful in this case.

The point is that you can't abstract away the details of backpropagation (which involve computing gradients) under some circumstances, for example when we are using gradient descent. Maybe in other circumstances (a global optimization algorithm) it wouldn't be an issue, but the leaky-abstraction idea isn't that the abstraction is always an issue.

(Right now, backpropagation is virtually the only way to compute gradients in deep learning.)
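
To make the split concrete, here's a minimal sketch (assuming PyTorch; the shapes, data, and learning rate are made up for illustration) of backpropagation producing the gradients that a gradient-descent step then consumes:

    # Minimal sketch (PyTorch assumed): backprop fills in the gradients;
    # gradient descent is the separate step that consumes them.
    import torch

    w = torch.randn(3, requires_grad=True)   # parameters
    x = torch.tensor([1.0, 2.0, 3.0])        # one input
    y = torch.tensor(0.5)                    # one target

    loss = ((w * x).sum() - y) ** 2          # quadratic loss
    loss.backward()                          # backpropagation: computes w.grad

    with torch.no_grad():
        w -= 0.01 * w.grad                   # gradient descent: uses w.grad
        w.grad.zero_()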


Replies

nirinor | today at 2:19 AM

So, is computing gradients a detail of backpropagation that it fails to abstract over, or are gradients the goal that backpropagation achieves? It isn't both; it's just the latter.

This is like complaining about long division not behaving nicely when dividing by 0. The algorithm isn't the problem, and blaming the wrong part does not help understanding.

It distracts from what actually helps, which is using loss functions with nicer-behaved gradients, e.g., the Huber loss instead of the quadratic loss.
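
As a rough illustration of the "nicer-behaved gradients" point (assuming PyTorch; delta=1.0 and the residual of 10 are arbitrary choices for the example), the quadratic loss's gradient grows with the residual, while the Huber loss's gradient is capped at delta:

    import torch
    import torch.nn.functional as F

    pred = torch.tensor([10.0], requires_grad=True)   # large residual on purpose
    target = torch.tensor([0.0])

    quad = 0.5 * (pred - target).pow(2).sum()         # quadratic loss
    quad.backward()
    print(pred.grad)   # tensor([10.]) -- grows linearly with the error

    pred.grad.zero_()
    huber = F.huber_loss(pred, target, delta=1.0)     # Huber loss
    huber.backward()
    print(pred.grad)   # tensor([1.]) -- capped at delta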
