Hacker News

simonw today at 3:46 AM

That's why the major AI labs are really careful about the code they include in the training runs.

The days of indiscriminately scraping every scrap of code on the internet and pumping it all in are long gone, from what I can tell.


Replies

jacquesm today at 4:33 AM

Well, if, as the OP points out, it is 'all garbage', they don't have a whole lot of room to discriminate.

fooker today at 3:48 AM

Do you have pointers to this?

Would be a great resource to understand what works and what doesn't.
