Hacker News

aspenmartin, last Wednesday at 7:12 PM (1 reply)

You're mixing up several concepts. Synthetic data works for coding because coding is a verifiable domain. You train via reinforcement learning to reward code generation behavior that passes detailed specs and meets other desiderata. It's literally how things are done today and how progress gets made.
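
To make "verifiable" concrete, here is a minimal sketch of the kind of reward signal I mean, assuming the spec is just a list of input/output cases. The names `verifiable_reward` and `SPEC_CASES` are made up for illustration and are not from any particular training stack; a real pipeline would run candidates in a sandbox and feed the score into an RL update rather than just returning it.

```python
from typing import List, Tuple

# Hypothetical spec: (args, expected_result) pairs for a function named "add".
SPEC_CASES: List[Tuple[tuple, object]] = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]


def verifiable_reward(candidate_src: str, func_name: str,
                      cases: List[Tuple[tuple, object]]) -> float:
    """Return the fraction of spec cases the generated code passes."""
    namespace: dict = {}
    try:
        # Illustration only: exec is NOT sandboxed. Never do this with
        # untrusted model output outside an isolated environment.
        exec(candidate_src, namespace)
        func = namespace[func_name]
    except Exception:
        # Code that fails to parse or does not define the function scores 0.
        return 0.0

    passed = 0
    for args, expected in cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            # A crash on a case counts as a failed case.
            pass
    return passed / len(cases) if cases else 0.0


good_sample = "def add(a, b):\n    return a + b\n"
bad_sample = "def add(a, b):\n    return a - b\n"

print(verifiable_reward(good_sample, "add", SPEC_CASES))
print(verifiable_reward(bad_sample, "add", SPEC_CASES))
```

The point is that the score comes from running the code, not from how closely it resembles existing code, so a sample that looks plausible but computes the wrong thing gets a lower reward.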


Replies

zwnow, last Wednesday at 8:53 PM

Most code out there is a legacy security nightmare, surely it's good to train on that.
