TLDR: 1. Use alpha = 2*rank 2. Don't use too small ranks (rank=1 to 8) 3. Sensational title...

danielhanchen • 11/08/2024 • 1 reply • view on HN

TLDR: 1. Use alpha = 2*rank

2. Don't use too small ranks (rank=1 to 8)

3. Sensational title. Better title "LoRA works if done right"

4. Didn't test SVD init

Replies

Tostino • 11/09/2024

Thanks for the TLDR. Yeah, pretty much fits my experience, though I mainly cared about the specific task performance I was training rather than caring about regressing unrelated tasks.

➕ show 1 reply

alt Hacker News

Replies