It means they have the same levers somewhere in the training process. Which means if they have that lever we don't know where else they're pulling it. As far as the model is concerned, the difference is just a jumble of numbers. Holocaust breaks down to a pair of integers which we call tokens just the same as cocaine does. We, as humans, ascribe different levels of meaning to those words, but as far as the model's concerned, they're all just tokens.
It means they have the same levers somewhere in the training process. Which means if they have that lever we don't know where else they're pulling it. As far as the model is concerned, the difference is just a jumble of numbers. Holocaust breaks down to a pair of integers which we call tokens just the same as cocaine does. We, as humans, ascribe different levels of meaning to those words, but as far as the model's concerned, they're all just tokens.