cherryteastain • today at 2:52 PM • 1 reply

A related viewpoint is that overparametrization is good because the model only gets stranded when the Hessian has all positive/zero eigenvalues, i.e. at a local minimum rather than a saddle point [1]. If we treat the sign of each Hessian eigenvalue as an independent Bernoulli trial, the chance of all eigenvalues being positive/zero decreases exponentially as the parameter count increases.

[1] https://arxiv.org/abs/1406.2572
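
A quick sketch of the arithmetic behind that Bernoulli argument, as a toy model only. It assumes each eigenvalue's sign is independent and non-negative with the same probability `p`. Both the independence and the value of `p` are illustrative assumptions, not something taken from [1], and the function names are made up for this example.

```python
import math


def prob_all_nonnegative(n_params: int, p: float = 0.5) -> float:
    """P(all n_params eigenvalues are >= 0) under the toy assumption that
    each sign is an independent Bernoulli trial with success probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    return p ** n_params


def log10_prob_all_nonnegative(n_params: int, p: float = 0.5) -> float:
    """Same quantity in log10 form, which avoids underflow to 0.0 for large n."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return n_params * math.log10(p)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, log10_prob_all_nonnegative(n))
```

With `p = 0.5`, ten parameters give `0.5 ** 10 = 0.0009765625`, and each additional parameter halves it again. That is the exponential decay in question. Real Hessian eigenvalues are correlated, so treat this as intuition rather than a prediction.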

Replies

david-gpu • today at 3:04 PM

You don't need billions of parameters for that, precisely because the risk of being stuck at a local minimum decreases exponentially with the number of parameters. Right?
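
To put a rough number on that: under the parent comment's toy Bernoulli model (independent eigenvalue signs, each non-negative with a hypothetical probability `p`), one can solve `p**n <= eps` for the smallest parameter count `n`. The function name and the choice of `p` and `eps` below are assumptions for illustration only.

```python
import math


def params_needed(eps: float, p: float = 0.5) -> int:
    """Smallest n such that p**n <= eps, assuming independent eigenvalue
    signs that are each non-negative with probability p (toy model)."""
    if not (0.0 < eps < 1.0 and 0.0 < p < 1.0):
        raise ValueError("eps and p must both lie strictly between 0 and 1")
    # p**n <= eps  <=>  n >= log(eps) / log(p), since log(p) < 0
    return math.ceil(math.log(eps) / math.log(p))


if __name__ == "__main__":
    print(params_needed(1e-9))   # 30
    print(params_needed(1e-30))  # 100
```

So in this simplified picture, on the order of a hundred parameters already makes an all-non-negative Hessian astronomically unlikely, which is the point of the question. Whether the independence assumption holds for real loss surfaces is a separate matter.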
