Hacker News

alizaid · 10/11/2024

Grokking is fascinating! It seems tied to how neural networks hit critical points in generalization. Could this concept also enhance efficiency in models dealing with non-linearly separable data?


Replies

wslh · 10/11/2024

Could you expand on grokking [1]? I understand it superficially, but it seems more important than the article conveys.

Particularly:

> Grokking can be understood as a phase transition during the training process. While grokking has been thought of as largely a phenomenon of relatively shallow models, grokking has been observed in deep neural networks and non-neural models and is the subject of active research.
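For concreteness: grokking is typically demonstrated on small algorithmic tasks such as modular addition, where a network trained on a fraction of all input pairs first memorizes the training set, and only much later (with regularization such as weight decay) does test accuracy jump from near-chance to near-perfect, which is the "phase transition" the passage describes. A minimal sketch of that canonical dataset setup (the network, optimizer, and training schedule are omitted and would be needed to actually observe the effect):

```python
import numpy as np

def modular_addition_dataset(p=97, train_frac=0.3, seed=0):
    """Build the (a + b) mod p task often used in grokking demos.

    All p*p input pairs are enumerated and split into a small
    training fraction and a held-out test set. A small model fit
    on the training split quickly reaches 100% train accuracy by
    memorization; test accuracy stays near chance for a long time,
    then abruptly rises -- the delayed-generalization signature.
    """
    pairs = np.array([(a, b) for a in range(p) for b in range(p)])
    labels = (pairs[:, 0] + pairs[:, 1]) % p
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(pairs))
    n_train = int(train_frac * len(pairs))
    train_idx, test_idx = idx[:n_train], idx[n_train:]
    return ((pairs[train_idx], labels[train_idx]),
            (pairs[test_idx], labels[test_idx]))

(X_train, y_train), (X_test, y_test) = modular_addition_dataset()
```

The fixed `train_frac` matters: grokking is most visible when the training split is small enough that memorization and generalization are clearly separated in time.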

Does that paper add more insights?

[1] https://en.wikipedia.org/wiki/Grokking_(machine_learning)?wp...
