The "alignment tax".
Exactly. This paper even shows that model creativity drops significantly and that the models experience mode collapse, like we saw in GANs, but companies keep using RLHF...
https://arxiv.org/abs/2406.05587