Hacker News

djoldman · today at 12:09 AM · 0 replies

RLHF is what creates these anomalies. See the overuse of "delve," attributed to RLHF annotators in Kenya and Nigeria.

Interestingly, because perplexity is the pretraining optimization objective, the pretrained (pre-RLHF) models should produce the least surprising outputs of all.
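To make the point concrete, here is a minimal sketch of what "optimizing for perplexity" means: perplexity is the exponential of the average per-token negative log-likelihood, so minimizing it directly rewards the model for making every token as unsurprising as possible. This is an illustrative stand-alone function, not any particular library's API.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token).

    token_log_probs: the model's log-probability for each observed token.
    Lower surprise per token -> lower perplexity.
    """
    n = len(token_log_probs)
    avg_nll = -sum(token_log_probs) / n
    return math.exp(avg_nll)

# A model that assigns probability 0.25 to each of 4 tokens is,
# on average, "choosing among 4 options" -> perplexity of 4.
print(perplexity([math.log(0.25)] * 4))
```

Pretraining drives this number down over the corpus, which is why the base model's maximum-likelihood continuations are the least surprising; RLHF then shifts the output distribution away from that objective, producing quirks like "delve."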