they are deterministic, open a dev console and run the same prompt two times w/ temperature = 0
So why don’t we all use LLMs with temperature 0? If we separate models (incl. parameters) into two classes, c1: temp=0, c2: temp>0, why is c2 so widely used vs c1? The nondeterminism must be viewed as a feature more than an anti-feature, making your point about temperature irrelevant (and pedantic) in practice.
And then the 3rd time it shows up differently leaving you puzzled on why that happened.
The deterministic has a lot of 'terms and conditions' apply depending on how it's executing on the underlying hardware.