hamiecod, yesterday at 7:49 AM

That's a strong RL technique that could match the quality of RLHF.