Are conventional compilers actually deterministic, with all the bells and whistles enabled? PGO seems like it ought to have a random element.
Not at all, when talking about managed runtimes.
Hence why it is hard to do benchmarks with various kinds of GC and dynamic compilers.
You can't even expect deterministic code generation for the same source code across various compilers.
It is generally considered a bug in a compiler if its output is nondeterministic. Of course, compilers are large, complex beasts, and nondeterminism is so easy to accidentally introduce (e.g., do a "for each" in a map where the key is a pointer), that it's probably not too hard to find cases that have nondeterminism.
> PGO seems like it ought to have a random element.
PGO should be deterministic based on the runs used to generate the profile. The runs are tracking information that should be deterministic--how many times does the the branch get taken versus not taken, etc. HWPGO, which relies on hardware counters to generate profiling information, may be less deterministic because the hardware counters end up having some statistical slip to them.
well considering you use components like DFA to build compilers, yes they are determenistic. you also have reproducible builds etc.
or does your binary always come out differently each time you compile the same file??
You can try it. try to compile the same file 10 times and diff the resultant binaries.
Now try to prompt a bunch of LLMs 10 times and diff the returned rubbish.
Yes, they will output the same file hash every time, short of some build time mutation. Thus we can have nice things like reproducible builds and integrity checks.
No, modulo bugs generally the same set of inputs to a compiler are guaranteed to produce the same output bit for bit which is the definition of determinism.
There’s even efforts to guarantee this for many packages on Linux - it’s a core property of security because it lets you validate that the compilation process or environment wasn’t tampered with illicitly by being able to verify by building from scratch.
Now actually managing to fix all inputs and getting deterministic output can be challenging, but that’s less to do with the compiler and more to do with the challenge of completely taking the entire environment (the profile you are using for PGO, isolating paths on the build machine being injected into the binary, programs that have things in their source or build system that’s non deterministic (e.g. incorporating the build time into the binary)