Hacker News

digitaltrees, today at 7:37 PM

Umm. It’s real development work in real settings with real model output. That makes it a high-quality dataset. Objecting that it isn’t good code from elite engineers confuses what “good” means in the context of coding agents. The first requirement is knowing how to respond to a range of prompts, and for that you need diverse real-world conversations. The second is the ability to respond with good code, which comes from labeling, other after-the-fact data curation, or other training methods. So code quality is a downstream consideration.