Just to be the pedant here, LLMs are fully deterministic (the same LLM, in the same state, with the same inputs, will deliver the same output, and you can totally verify that by running an LLM locally). It's just that they are chaotic: two prompts with slight, seemingly minor differences can produce not just different but conflicting outputs.
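If you want to see the deterministic part for yourself, here's a minimal sketch using Hugging Face transformers with greedy decoding (gpt2 is just an example model; any small local causal LM works):

```python
# Minimal sketch: check that greedy decoding is repeatable on one machine.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # example model; substitute any local causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "The quick brown fox"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding removes sampling randomness entirely.
out1 = model.generate(**inputs, do_sample=False, max_new_tokens=20)
out2 = model.generate(**inputs, do_sample=False, max_new_tokens=20)

print(tokenizer.decode(out1[0]))
print(out1.equal(out2))  # True: same model, same input, same hardware -> same tokens
```

With sampling enabled you'd also need to fix the RNG seed; greedy decoding just sidesteps the randomness entirely.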
To pedant it up a bit more: that only holds on the same GPU, not across different ones.
Even if they weren't chaotic, prompt injection would probably be a problem imho
> Just to be the pedant here, LLMs are fully deterministic ... you can totally verify that by running an LLM locally
To be even more pedantic, this is only true if the LLM is run locally on the same GPU, with nondeterministic optimizations (e.g. certain CUDA kernels and dynamic batching) disabled.
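The root cause is that floating-point addition isn't associative, so a parallel reduction that sums in a different order (different GPU, different kernel, different batch shape) can land on slightly different logits, and occasionally that flips which token wins the argmax. A toy illustration of the arithmetic (plain NumPy, not an actual GPU kernel):

```python
import numpy as np

# Floating-point addition isn't associative, so grouping matters.
a, b, c = 0.1, 0.2, 0.3
print((a + b) + c == a + (b + c))  # False: 0.6000000000000001 vs 0.6

# The same effect at scale: summing the same float32 values in two different
# orders usually gives slightly different totals, which is one reason a GPU
# reduction can vary across hardware or kernel choices.
rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000).astype(np.float32)
print(np.sum(x), np.sum(x[::-1]))  # typically differs in the last few bits
```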