LLMs produce deterministic results? Now, that's a big [citation needed]. Where can I find the specs?
Edit: This is assuming by "deterministic," you mean the same thing I said about programming language implementations being "controllable, reproducible, and well-defined." If you mean it produces random but same results for the same inputs, then you haven't made any meaningful points.
I'd recommend learning how transformers work, and the concept of temperature. I don't think I need to cite information that is broadly and readily available, but here:
https://medium.com/google-cloud/is-a-zero-temperature-determ...
I also qualified the requirement of needing the same hardware, due to FP shenanigans. I could further clarify that you need the same stack (pytorch, tensorflow, etc)