Wouldn't seeding the RNG used to pick the next token be more configurable? How would changing the hardware/other software make a difference to what comes out of the model?
> Wouldn't seeding the RNG used to pick the next token be more configurable?
Sure, that would work.
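For illustration, here's a minimal sketch of what seeding the sampler looks like. The logits and the `sample_token`/`generate` helpers are made up for the example (no real model is involved); the point is only that the same seed and the same logits give the same sequence of picks.

```python
import math
import random

def sample_token(logits, rng):
    """Pick a token index from softmax(logits) using the supplied RNG."""
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def generate(seed, steps=8):
    """Sample `steps` tokens from a fixed, made-up logit vector."""
    rng = random.Random(seed)       # the seed fully determines the draws
    logits = [2.0, 1.0, 0.5, 0.1]   # hypothetical values, not from a real model
    return [sample_token(logits, rng) for _ in range(steps)]

if __name__ == "__main__":
    print(generate(seed=1234))
    print(generate(seed=1234))  # same seed, same logits: same output
```

Note that this only pins down the random draws. It guarantees identical output only if the logits themselves are bit-for-bit identical from run to run.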
> How would changing the hardware/other software make a difference to what comes out of the model?
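Floating-point arithmetic is not entirely consistent between different GPUs/TPUs/operating systems. Floating-point addition isn't associative, so if the hardware or libraries add the same numbers up in a different order, the logits can come out slightly different, and with near-tied probabilities that can change which token gets picked even with the same seed.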
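A quick way to see the underlying effect, as a sketch in plain Python rather than anything GPU-specific: summing the same floats in a different order can give a different result. The values below are chosen to exaggerate the effect; in a real model the differences would typically be tiny.

```python
def sum_in_order(values):
    """Naive left-to-right floating-point sum."""
    total = 0.0
    for v in values:
        total += v
    return total

# Same four numbers, two different summation orders.
values = [1e16, 1.0, -1e16, 1.0]
as_given = sum_in_order(values)
as_sorted = sum_in_order(sorted(values))

if __name__ == "__main__":
    print(as_given, as_sorted)
    print(as_given == as_sorted)
```

Mathematically both sums are 2, but neither order produces that here, and the two orders disagree with each other. A matrix multiply is a lot of sums like this, so a different reduction order on different hardware could plausibly nudge the logits in the same way.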