Hacker News

Reinforcement Learning from Human Feedback

59 points by onurkanbkrc today at 12:53 PM | 5 comments

https://arxiv.org/abs/2504.12501


Comments

dang today at 6:16 PM

Related. Others?

RLHF Book - https://news.ycombinator.com/item?id=42902936 - Feb 2025 (37 comments)

verdverm today at 2:47 PM

Last time I saw Nathan say something about the book, he was actively working on the next version and looking for feedback. Check his socials.

klelatti today at 1:46 PM

Web version with links, etc.:

https://rlhfbook.com/

iisweetheartii today at 2:01 PM

[dead]