LLMs specifically are fine with random bits flipped for the results to be 'creative'.
That's not exactly how LLM temperature works. :) Also, that happens at inference, not training. Presumably these chips would be used for training; the latency would be too high for inference.
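For context on the temperature point: temperature doesn't flip random bits in the output. It rescales the logits before the softmax, so a higher temperature flattens the token distribution (more "creative" sampling) and a lower one sharpens it toward the argmax. A minimal sketch of this, with illustrative names and toy logits:

```python
import math
import random

def temperature_softmax(logits, temperature=1.0):
    """Divide logits by temperature, then apply a numerically stable softmax."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max to avoid overflow in exp
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature=1.0):
    """Sample a token index from the temperature-adjusted distribution."""
    probs = temperature_softmax(logits, temperature)
    r = random.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

logits = [1.0, 3.0, 2.0]
print(temperature_softmax(logits, temperature=1.0))   # moderate spread
print(temperature_softmax(logits, temperature=0.1))   # nearly all mass on index 1
print(temperature_softmax(logits, temperature=10.0))  # close to uniform
```

So the randomness is a deliberate, tunable part of the sampling step, not noise injected into the computation itself, which is why flaky hardware isn't a substitute for it.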