Hacker News

djoldman · today at 12:09 AM · 0 replies

RLHF is what creates these anomalies. See the overuse of "delve," attributed to RLHF annotators in Kenya and Nigeria.

Interestingly, because perplexity is the pretraining optimization objective, the pretrained (pre-RLHF) models should produce the least surprising outputs of all.
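To make the point concrete, here is a minimal sketch of what "optimizing for perplexity" means: perplexity is the exponential of the average per-token negative log-likelihood, so minimizing it directly rewards the model for making every token as unsurprising as possible. This is an illustrative stand-alone function, not any particular library's API.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token).

    token_log_probs: the model's log-probability for each observed token.
    Lower surprise per token -> lower perplexity.
    """
    n = len(token_log_probs)
    avg_nll = -sum(token_log_probs) / n
    return math.exp(avg_nll)

# A model that assigns probability 0.25 to each of 4 tokens is,
# on average, "choosing among 4 options" -> perplexity of 4.
print(perplexity([math.log(0.25)] * 4))
```

Pretraining drives this number down over the corpus, which is why the base model's maximum-likelihood continuations are the least surprising; RLHF then shifts the output distribution away from that objective, producing quirks like "delve."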