What we also learned after GPT-3.5 is that, to get around the need for new training data, we can simply use existing LLMs to generate new, synthetic data. I would not be surprised if the em dash is a product of synthetically generated data used to train newer models, perhaps data in which it was deliberately encouraged to appear.
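To make the idea concrete, here is a minimal sketch of what such a synthetic-data pipeline might look like. Everything here is illustrative: `llm_complete` is a hypothetical placeholder for whatever model API one would actually call, and the style instruction in the prompt is just an example of how a stylistic quirk could get nudged into the generated text.

```python
import json

def llm_complete(prompt: str) -> str:
    """Placeholder for a call to an existing LLM.

    In a real pipeline this would call a model API; here it returns a
    canned string so the sketch runs on its own.
    """
    return f"(model output for prompt: {prompt[:40]}...)"

def generate_synthetic_examples(topics, n_per_topic=3):
    """Generate short synthetic training examples, one record per completion."""
    examples = []
    for topic in topics:
        for _ in range(n_per_topic):
            prompt = (
                f"Write a short, well-edited paragraph about {topic}. "
                "Use polished prose with varied punctuation."  # stylistic nudge
            )
            examples.append({"topic": topic, "text": llm_complete(prompt)})
    return examples

if __name__ == "__main__":
    data = generate_synthetic_examples(["coffee", "bicycles"])
    with open("synthetic_data.jsonl", "w") as f:
        for example in data:
            f.write(json.dumps(example) + "\n")
```

If millions of examples are produced this way and folded into the next model's training mix, any stylistic bias in the prompts or in the generating model carries straight over.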