immibis • last Monday at 12:56 AM • 0 replies • view on HN

It's not about delivering punishment. It's about suppressing certain responses. If the model is trained on examples where responses don't contain the things that previous messages said would be punished, then that is a valid way to deprioritize those responses.
