Hacker News

ismailmaj · 11/08/2024

How does it compare to partially fine-tuning the model by freezing most of the network besides the last few layers?
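For context, the partial fine-tuning the question describes is usually done by turning off gradients for most of the network and training only the final blocks. A minimal sketch in PyTorch, using a toy stand-in model (the architecture, sizes, and which layers to unfreeze are all illustrative assumptions, not anything from the thread):

```python
import torch
import torch.nn as nn

# Toy stand-in for a pretrained network: embedding, three
# transformer blocks, and an output head.
model = nn.Sequential(
    nn.Embedding(1000, 64),  # stand-in embedding layer
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    nn.Linear(64, 1000),  # stand-in output head
)

# Freeze everything first...
for p in model.parameters():
    p.requires_grad = False

# ...then unfreeze only the last transformer block and the head,
# so the optimizer only updates those parameters.
for p in model[3].parameters():
    p.requires_grad = True
for p in model[4].parameters():
    p.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable} / {total}")
```

An optimizer built as `torch.optim.AdamW(p for p in model.parameters() if p.requires_grad)` would then touch only the unfrozen layers.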


Replies

K0balt · 11/09/2024

Idk, but if I were guessing, I would guess that process would be likely to create intruder dimensions in those layers… but it's hard to say how impactful that would be. Intuitively I would think it would tend to channel a lot of irrelevant outputs toward the semantic space of the new training data, but idk how well that intuition would hold up to reality.
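"Intruder dimensions" here are commonly measured as singular vectors of the fine-tuned weight matrix that have low cosine similarity to every singular vector of the pretrained matrix. A hedged NumPy sketch of that measurement, with random matrices standing in for real weights and an arbitrary similarity threshold (both are assumptions for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
W_pre = rng.standard_normal((64, 64))           # stand-in pretrained weights
W_ft = W_pre + 0.5 * rng.standard_normal((64, 64))  # stand-in fine-tuned weights

# Left singular vectors of each matrix (columns of U).
U_pre, _, _ = np.linalg.svd(W_pre)
U_ft, _, _ = np.linalg.svd(W_ft)

# For each fine-tuned singular vector, its best cosine match among
# the pretrained singular vectors; values near 0 suggest a new
# ("intruder") direction absent from the pretrained weights.
sims = np.abs(U_pre.T @ U_ft).max(axis=0)
intruders = int((sims < 0.6).sum())  # 0.6 is an illustrative threshold
print(f"{intruders} of {sims.size} directions below similarity 0.6")
```

Running the same comparison on actual pretrained vs. fine-tuned checkpoints (layer by layer) would be one way to test the intuition in the comment.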