Yeah, deep learning treats any training data as absolute, god-given ground truth and will completely restructure the model to fit the dumbest shit you feed it.
The first LLMs were utter crap because of that. But once you have just one that's good enough, you can use it for dataset filtering, and everything gets dramatically better once the data is self-consistent enough that there are non-contradictory patterns to learn, instead of conflicting examples pulling the gradient in opposite directions.
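
To make the filtering idea concrete, here's a rough sketch in Python. It's just the general shape of the thing, not how any actual lab does it. `filter_dataset`, `toy_score`, and the 0.8 threshold are all made up for illustration; in a real pipeline the scorer would be a call to your "good enough" model, not a character-counting heuristic.

```python
from typing import Callable, Iterable, List


def filter_dataset(
    samples: Iterable[str],
    score: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep only the samples the scorer rates at or above the threshold."""
    return [s for s in samples if score(s) >= threshold]


def toy_score(text: str) -> float:
    """Hypothetical stand-in for a model's quality judgment.

    Returns the fraction of characters that are letters or whitespace,
    so symbol spam and URL debris score low. Assumption: a real setup
    would replace this with a model call returning a quality score.
    """
    if not text:
        return 0.0
    clean = sum(ch.isalpha() or ch.isspace() for ch in text)
    return clean / len(text)


raw = [
    "The cat sat on the mat.",
    "$$$ CLICK >>> http://x.y/?a=1&b=2 <<< $$$",
    "Gradient descent minimizes a loss function.",
    "",
]

# Only the two ordinary sentences clear the (arbitrary) 0.8 threshold.
kept = filter_dataset(raw, toy_score, threshold=0.8)
print(kept)
```

The scorer is passed in as a plain function so you can swap the toy heuristic for a real model without touching the filtering logic.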