
danpalmer · today at 3:47 AM

I'm pretty sure the determinism issue is at the floating-point math level, or even the hardware level. Disabling batching and reducing the temperature to 0 alone does not produce truly deterministic answers.


Replies

orbital-decay · today at 4:02 AM

FP math itself is deterministic on real hardware, as long as the order of operations stays the same. Output reproducibility is much less of a problem than it seems; see for example https://docs.vllm.ai/en/latest/usage/reproducibility/
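A small sketch of the point being made here: repeating the exact same additions in the exact same order always gives a bit-identical result, but reordering those same additions can change the rounding and therefore the answer (the values below are contrived to make the effect visible).

```python
import struct

def bits(x: float) -> str:
    # Exact bit pattern of an IEEE 754 double.
    return struct.pack("<d", x).hex()

xs = [0.1, 0.2, 0.3, 1e16, -1e16]

def sum_in_order(vals):
    # A plain left-to-right reduction with a fixed operation order.
    s = 0.0
    for v in vals:
        s += v
    return s

# Determinism: same operations, same order -> one bit pattern across runs.
runs = {bits(sum_in_order(xs)) for _ in range(1000)}
print(len(runs))  # 1

# Non-associativity: reordering the *same* additions changes the result.
print(sum_in_order(xs) == sum_in_order(list(reversed(xs))))  # False
```

Forward, the small terms are absorbed by 1e16 before it cancels, giving 0.0; reversed, the big terms cancel first and the small terms survive. Either order is perfectly repeatable on its own.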

nnevati · today at 4:53 AM

The FP math is deterministic. However, the environments in which inference runs, and batching in particular, make current LLM services non-deterministic in practice.