Models are deterministic: they're a mathematical function from sequences of tokens to probability distributions over the next token.
A system then samples from that distribution, typically with randomness, and some optimizations used when running the models introduce randomness of their own, but it's important to understand that the models themselves are not random.
This is only true in the ideal case. From the perspective of a user of a large closed LLM, it isn't quite right because of floating-point non-associativity, provider-side experiments, unversioned changes, etc.
It's best to assume that the relationship between the input and output of an LLM is not deterministic, much as you would for something like a Google search API.
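To make the non-associativity point concrete, here is a minimal sketch in plain Python. It only shows that floating-point addition gives different results depending on grouping; the assumption (not something the thread states in detail) is that parallel hardware may group the same additions differently from run to run, which is one way bit-level differences can creep into a computation that is deterministic on paper.

```python
# Floating-point addition is not associative: the same three numbers,
# grouped differently, produce results that differ in the last bits.
a, b, c = 0.1, 0.2, 0.3

left_grouped = (a + b) + c   # add a and b first
right_grouped = a + (b + c)  # add b and c first

print(left_grouped == right_grouped)   # the two groupings disagree
print(abs(left_grouped - right_grouped))  # the gap is tiny but non-zero
```

The gap is tiny, but if two candidate tokens have nearly equal scores, a difference of this size could in principle be enough to change which one ranks first.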
The LLMs are deterministic, but they only return a probability distribution over the next token. The tokens the user sees in the response are selected by a sampling procedure that is typically stochastic.
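A toy sketch of that split, in case it helps. Everything here is hypothetical: `toy_model` is a made-up stand-in for an LLM with a three-token vocabulary, not any real model or API. The point is only that the model step is a pure function, and the randomness lives entirely in the sampler.

```python
import math
import random

VOCAB = ["a", "b", "c"]  # hypothetical three-token vocabulary

def toy_model(tokens):
    """Pure function: token sequence -> next-token probability distribution."""
    # Made-up scores that depend only on the input (here, just its length).
    scores = [float((len(tokens) + i) % 3 + 1) for i in range(len(VOCAB))]
    # Softmax turns the scores into probabilities.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return {tok: e / total for tok, e in zip(VOCAB, exps)}

def greedy(dist):
    """Deterministic decoding: always pick the most probable token."""
    return max(dist, key=dist.get)

def sample(dist, rng):
    """Stochastic decoding: draw a token according to its probability."""
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]

dist = toy_model(["x"])
print(dist == toy_model(["x"]))  # same input, same distribution
print(greedy(dist))              # same token on every call
print([sample(dist, random.Random()) for _ in range(5)])  # may vary per run
```

Seeding the generator (for example `random.Random(0)`) makes the sampled output repeatable too, which is why the randomness is a property of the sampling step and not of the model.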