Hacker News

wslh 10/11/2024

Could you expand on grokking [1]? I understand what it means only superficially, but it seems more important than the article conveys.

Particularly:

> Grokking can be understood as a phase transition during the training process. While grokking has been thought of as largely a phenomenon of relatively shallow models, grokking has been observed in deep neural networks and non-neural models and is the subject of active research.

Does that paper add more insights?

[1] https://en.wikipedia.org/wiki/Grokking_(machine_learning)?wp...


Replies

tanananinena 10/11/2024

This is probably the most interesting (and insightful) paper on grokking I’ve read recently: https://arxiv.org/abs/2402.15555