
antonvs · yesterday at 6:53 PM

> my understanding is that setting temperature=0, top_p=1 would cause deterministic output (identical output given identical input).

That's typically correct. Many models are deliberately implemented that way, and I believe it's true of most, if not all, of the major models.
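The usual intuition: at temperature 0, sampling degenerates to an argmax over the logits, so the sampling step itself introduces no randomness. A toy sketch (the logits here are made up):

    import torch

    logits = torch.tensor([1.2, 3.4, 0.5])  # hypothetical next-token scores
    # Temperature 0 collapses sampling to greedy decoding: always take
    # the highest-scoring token, so no randomness enters at this step.
    next_token = torch.argmax(logits).item()  # always 1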

> Is this universally correct or is it dependent on model used?

There are implementation details that can introduce uncontrollable non-determinism unless they're explicitly prevented in the model implementation. See e.g. the PyTorch docs on CUDA convolution determinism: https://docs.pytorch.org/docs/stable/notes/randomness.html#c...

That page documents settings like this:

    torch.backends.cudnn.deterministic = True
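Beyond that flag, the same page describes a few more knobs. A minimal sketch of a reproducible PyTorch setup, assuming a CUDA 10.2+ environment, looks roughly like:

    import os
    import torch

    # Required for deterministic cuBLAS GEMMs on CUDA 10.2+; must be
    # set before CUDA is initialized.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

    # Raise an error if an op has no deterministic implementation.
    torch.use_deterministic_algorithms(True)

    # Force deterministic cuDNN convolutions and disable the benchmark
    # autotuner, which can pick different kernels across runs.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

    # Seed all RNGs so "identical input" also means identical RNG state.
    torch.manual_seed(0)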
Parallelism can also be a source of non-determinism if it's not controlled for, whether it's introduced implicitly (e.g. via dependencies) or explicitly. Floating-point addition isn't associative, so parallel reductions that combine partial results in a different order can produce different answers, as the sketch below shows.
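A toy illustration in plain Python (no actual parallelism needed to see the effect):

    # Floating-point addition is not associative, so the order in which
    # parallel workers combine their partial sums can change the result.
    a = (0.1 + 0.2) + 0.3   # 0.6000000000000001
    b = 0.1 + (0.2 + 0.3)   # 0.6
    print(a == b)           # False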