an0malous yesterday at 8:16 PM

Aren’t transformers intrinsically deterministic? I thought the randomness was intentional, to make chatbots seem more natural, and OpenAI used to have a seed parameter you could set for deterministic output. I don’t know why that feature isn’t more popular, given the reasons this article outlines.



jkaptur yesterday at 9:23 PM

(I'm not an expert. I'd love to be corrected by someone who actually knows.)

Floating-point arithmetic is not associative. (A+B)+C does not necessarily equal A+(B+C), but you can get a performance improvement by calculating A, B, and C in parallel, then adding together whichever two finish first. So, in theory, transformers can be deterministic, but in a real system they almost always aren't.
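To make the non-associativity concrete, here is a minimal Python sketch. The helper `sum_in_order` is a hypothetical name, and it only simulates the accumulation order changing between runs; it is not how any real inference kernel is written.

```python
def sum_in_order(values, order):
    """Add up `values`, visiting them in the index order given by `order`."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


a, b, c = 0.1, 0.2, 0.3

# Same three numbers, two different groupings.
left = (a + b) + c
right = a + (b + c)
print(left == right)  # False: the two groupings round differently

# Simulate "whichever two finish first" by changing the accumulation order.
values = [a, b, c]
run_1 = sum_in_order(values, (0, 1, 2))  # a, then b, then c
run_2 = sum_in_order(values, (1, 2, 0))  # b, then c, then a
print(run_1, run_2)
```

The difference here is in the last bit of the result, but in a model such differences feed into later layers and can occasionally change which token ends up with the highest score.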

janalsncm yesterday at 9:30 PM

Transformers are just a special kind of binary that is run by inference code. Where the rubber meets the road is whether the inference setup is deterministic. There’s some literature on this: https://thinkingmachines.ai/blog/defeating-nondeterminism-in...

I don’t think the issue is determinism per se but chaotic predictions that are difficult to rely on.

solsane yesterday at 9:51 PM

Well, you could say that about computers in general. I'm assuming you're referring to temperature (or something similar), which can be set so the model always picks the most probable token. Floats aside, this should be deterministic. But practically I don't think that changes much, since adjusting the input slightly can lead to very different output. Also, back in the day, temperature helped it avoid cyclic loops.

esafak yesterday at 10:05 PM

The models generate a token distribution. Which one to pick is a choice. One can sample from the distribution, hence the randomness.
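A minimal sketch of that choice in plain Python, assuming a toy three-token vocabulary with made-up logits. The helpers `softmax`, `pick_greedy`, and `pick_sampled` are illustrative names, not any library's API, and a real decoder would work on much larger tensors.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; temperature must be > 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_greedy(logits):
    """Always choose the index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


def pick_sampled(logits, temperature, rng):
    """Draw one token index at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.5, 0.3]  # hypothetical scores for three tokens

# Greedy decoding: the same pick every time.
greedy_picks = [pick_greedy(logits) for _ in range(5)]

# Sampling with a fixed seed: random-looking, but repeatable.
rng_a = random.Random(42)
rng_b = random.Random(42)
run_a = [pick_sampled(logits, 1.0, rng_a) for _ in range(10)]
run_b = [pick_sampled(logits, 1.0, rng_b) for _ in range(10)]

print(greedy_picks)
print(run_a == run_b)  # True: same seed, same sequence of draws
```

Greedy picking removes the sampling randomness entirely, and a fixed seed makes sampled output repeatable. Neither addresses the floating-point ordering effects raised elsewhere in the thread, which can change the logits themselves.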

bpodgursky yesterday at 9:32 PM

Strict deterministic output for a given prompt prevents the use of RAG, which increasingly limits the relative utility of an LLM within an organization.

ares623 yesterday at 8:40 PM

Maybe it allowed spitting out copyrighted works verbatim.