Interesting. I see papers where researchers will finetune models in the 7 to 12b range and even beat...

codemog • yesterday at 8:51 AM

Interesting. I see papers where researchers will finetune models in the 7 to 12b range and even beat or be competitive with frontier models. I wish I knew how this was possible, or had more intuition on such things. If anyone has paper recommendations, I’d appreciate it.

Replies

stavros • yesterday at 10:08 AM

They're using a revolutionary new method called "training on the test set".
