
imtringued · yesterday at 10:43 AM

Gradient descent doesn't matter here. Second-order and higher-order methods still use the lower-order derivatives.

Backpropagation is reverse-mode automatic differentiation. They are the same thing.

And for those who don't know what backpropagation is: it is just an efficient way to compute the gradient of the loss with respect to all parameters at once.
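
A minimal sketch of that point in JAX (the model, names, and shapes here are just illustrative): jax.grad builds the reverse-mode pass, so a single backward sweep returns the gradient with respect to every parameter, at a cost comparable to a few extra forward evaluations.

    import jax
    import jax.numpy as jnp

    # Toy model: one linear layer with a squared-error loss.
    def loss(params, x, y):
        w, b = params
        pred = x @ w + b
        return jnp.mean((pred - y) ** 2)

    params = (jnp.ones((3, 1)), jnp.zeros(1))  # weights and bias
    x = jnp.arange(6.0).reshape(2, 3)
    y = jnp.array([[1.0], [2.0]])

    # One reverse-mode sweep yields the gradient for all parameters at once.
    grads = jax.grad(loss)(params, x, y)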