Hacker News
hamiecod
•
yesterday at 7:49 AM
That's a strong RL technique that could match the quality of RLHF.