Hacker News

nayroclade · today at 1:39 AM · 2 replies

This isn't really true, though. Pre-training for coding models is just a mass of scraped source code, but post-training is more than simply generating code that compiles. It includes extensive reinforcement learning on curated software-engineering tasks designed to teach what high-quality code looks like and to improve abilities like debugging, refactoring, and tool use.
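To make the idea concrete, here's a toy sketch of the kind of shaped reward such a post-training pipeline might assign per episode. All the names and weights (`task_reward`, the 0.05 lint penalty, the tool-use bonus) are illustrative assumptions, not any lab's actual setup:

```python
def task_reward(tests_passed: int, tests_total: int,
                lint_warnings: int, tool_calls_ok: bool) -> float:
    """Toy reward for one software-engineering RL episode.

    Combines functional correctness (test pass rate) with weaker
    signals for code quality and correct tool use -- illustrative
    only; real reward models are far more elaborate.
    """
    if tests_total == 0:
        return 0.0
    correctness = tests_passed / tests_total          # dominant term
    quality_penalty = min(0.3, 0.05 * lint_warnings)  # capped style penalty
    tool_bonus = 0.1 if tool_calls_ok else 0.0        # reward correct tool use
    return max(0.0, correctness - quality_penalty + tool_bonus)

# A patch that passes all tests cleanly and uses tools correctly
# scores highest:
print(task_reward(10, 10, 0, True))   # 1.1
print(task_reward(7, 10, 4, False))   # ~0.5: partial pass, lint penalty
```

The point is that the optimization target is richer than "does it compile": correctness dominates, but quality and process signals shape the gradient too.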


Replies

softwaredoug · today at 1:42 AM

Well, also a lot of Claude Code user data. That telemetry is invaluable.

sarchertech · today at 2:07 AM

There’s no objective measure of high-quality code, so I don’t think model creators will be particularly good at screening for it.