d_burfoot, today at 2:44 AM

> What possible use could there be for doing this?

The point is to generate an enormous unlabeled dataset. Historically, ML for medical imaging depended on a small number of labeled images - small because an expert had to study each image and label it as healthy/cancer/etc. But the "GPT breakthrough" was that training on vast unlabeled datasets - text, in the case of LLMs - works better than training on small labeled ones.