Hacker News

geremiiah today at 11:47 AM

No, it's not like saying that at all.


Replies

qsera today at 12:57 PM

Yes, that is exactly like it.

Generating new training data from existing data will only produce patterns that already exist in the training data. It might help an LLM capture those patterns if it has not already. But it can never generate novel patterns.

For example, imagine some kind of neural network architecture that can do OCR. You might be able to generate variations of the letters it already knows using some technique, and use them to better train the recognition of those already known letters.

But it would never be able to generate letters that it does not know.
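
To make the OCR example concrete, here is a minimal sketch, not a real OCR pipeline. The 3x3 bitmaps, the `augment` function, and the pixel-flip noise are all made up for illustration. It only shows the narrow point that augmenting known letters yields more samples of the same labels.

```python
import random

# Hypothetical 3x3 bitmaps for the two letters the system "knows".
KNOWN_GLYPHS = {
    "I": ((0, 1, 0), (0, 1, 0), (0, 1, 0)),
    "L": ((1, 0, 0), (1, 0, 0), (1, 1, 1)),
}


def augment(bitmap, rng, flip_prob=0.1):
    """Return a noisy variant of a bitmap: each pixel flips with flip_prob."""
    return tuple(
        tuple(pixel ^ 1 if rng.random() < flip_prob else pixel for pixel in row)
        for row in bitmap
    )


def make_augmented_dataset(glyphs, copies_per_letter, seed=0):
    """Build (bitmap, label) pairs by augmenting each known glyph."""
    rng = random.Random(seed)
    return [
        (augment(bitmap, rng), label)
        for label, bitmap in glyphs.items()
        for _ in range(copies_per_letter)
    ]


dataset = make_augmented_dataset(KNOWN_GLYPHS, copies_per_letter=100)
labels = {label for _, label in dataset}

# Many new samples, but the label set is still exactly the known letters.
print(len(dataset), sorted(labels))  # 200 ['I', 'L']
```

However many copies you generate, a "T" never shows up as a label, because nothing in the procedure knows what a "T" is.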