That would reduce the training quality immensely. Besides, any generalist model really needs to remember facts and texts verbatim to stay useful, not just generalize. There's no easy way around that.