2 PB? They will not come close to training on that amount. Maybe years from now.
I think they will not train on the full 2TB but use that as the data lake to start and then apply a more targeted approach.
Could probably LoRA with that