Hacker News

goldenarm · last Saturday at 10:59 PM · 3 replies

Now that StackOverflow has been killed (in part) by LLMs, how will we train future models? Will public GitHub repos be enough?

Precise troubleshooting data is getting rare; GitHub issues are the last place where it still lives nowadays.


Replies

Vaslo · last Saturday at 11:03 PM

They would just use documentation. I know there is some synthesis they would lose in the training process, but I'm often sending Claude through the context7 MCP to learn the documentation for packages that didn't exist when it was trained, and it nearly always solves the problem for me.
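
For readers unfamiliar with the workflow: Context7 is an MCP server that fetches up-to-date library documentation into the model's context. A minimal setup typically amounts to registering it in the MCP client's config file; the package name (@upstash/context7-mcp), config location, and JSONC-style comments below are illustrative assumptions, not something the commenter specified, so check the Context7 docs for the exact setup. Strip the comments for strict JSON.

    {
      // Register the Context7 MCP server so the assistant can pull current docs on demand.
      // Package name assumed: @upstash/context7-mcp (run via npx).
      "mcpServers": {
        "context7": {
          "command": "npx",
          "args": ["-y", "@upstash/context7-mcp"]
        }
      }
    }

Once registered, the assistant can be asked to consult a library's current documentation instead of relying only on what was in its training data.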

gitaarik · last Sunday at 3:59 AM

They pay lots of humans to train the LLMs.