> The Go compiler is written in Go, for example!
Do you know how they avoid the GC in the Go implementation of the Go compiler? If I understand correctly they need to implement the Go garbage collector in their Go implementation of the Go compiler. But Go already has a garbage collector. So how do they avoid invoking Go's garbage collector so that they can implement the garbage collector of the Go language they are implementing?
Not sure if I'm making sense but I'd like to know more about this from those who understand this more than I do.
Remember that a compiler generates an executable file (you can almost think of it as a transpiler to assembly). Oversimplifying a bit, that file must contain everything the language needs to operate, which includes the runtime as well as the compiled instructions from the user's code. Compare this with an interpreter, which doesn't need to pack all the implementation details into a binary and can instead use the host language's runtime.
All this to say: the output of a compiler is by necessity not tied to the language the compiler is written in; it is tied to the machine the executable should run on. A compiler "merely" translates instructions from a high-level language into machine-executable ones. So stuff like a GC must be coded, compiled and then "injected" into the binary so the user's code can interact with it. In an interpreted language this isn't necessary, since the host language is already running and contains the tools that would otherwise have to be injected into the binary.
They just use the implementation from the last version of the compiler, which you can follow back in a long chain to the first implementation. As for the implementation of the garbage collector, its own code probably just avoids allocating anything from the garbage-collected heap. The basics of a garbage collector are two functions, "alloc" and "collect". The function to allocate memory usually looks something like this:
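#include <stddef.h>

static char heap[100000000];  /* fixed arena reserved up front */
static size_t heap_end;       /* offset of the first free byte */

/* Bump allocator: hand out the next n_bytes of the arena, or NULL when full. */
void *alloc(size_t n_bytes) {
    if (n_bytes > sizeof heap - heap_end) return NULL;
    void *out = &heap[heap_end];
    heap_end += n_bytes;
    return out;
}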
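For the other half, "collect", here is a minimal mark-and-sweep sketch in C. Everything in it is an assumption made for illustration: the `Obj` layout with a single `ref` field, the fixed `pool`, the `obj_alloc` helper and the explicit `roots` array are all hypothetical, and a production collector such as Go's is far more elaborate. What it shares with the allocator above is that all of its bookkeeping lives in memory reserved up front.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8

/* Hypothetical object layout: each object holds at most one reference. */
typedef struct Obj {
    struct Obj *ref;  /* outgoing reference, or NULL */
    int in_use;       /* is this slot currently allocated? */
    int marked;       /* reached during the current mark phase? */
} Obj;

static Obj pool[POOL_SIZE];  /* static storage, so zero-initialized */

/* Take the first free slot from the fixed pool; NULL when the pool is full. */
Obj *obj_alloc(Obj *ref) {
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].marked = 0;
            pool[i].ref = ref;
            return &pool[i];
        }
    }
    return NULL;
}

/* Mark phase: follow the reference chain starting at one root. */
static void mark(Obj *o) {
    while (o != NULL && !o->marked) {
        assert(o->in_use);  /* a live object must never point at a freed slot */
        o->marked = 1;
        o = o->ref;
    }
}

/* Mark everything reachable from the roots, then release every slot that
   was not reached. Returns the number of slots released. */
size_t collect(Obj **roots, size_t n_roots) {
    size_t freed = 0;
    for (size_t i = 0; i < n_roots; i++) {
        mark(roots[i]);
    }
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (pool[i].in_use && !pool[i].marked) {
            pool[i].in_use = 0;
            freed++;
        }
        pool[i].marked = 0;  /* reset for the next collection */
    }
    return freed;
}
```

Calling `collect` with an array of the objects the program can still reach releases every other slot, and `obj_alloc` can then hand those slots out again.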
As you can see, it doesn't need to allocate any memory to do this.
How does clang, a C++ compiler that is itself written in C++, use <feature from C++> that it is itself implementing?
Why wouldn’t it be able to?
I don’t understand how your question specifically relates to garbage collection, or why the compiler would need to avoid it. The Go compiler is a normal Go program and garbage collection works in it the same way it does in any other Go program.
I’ve never used Go myself, but according to https://go.dev/doc/install/source you need an existing Go compiler to build Go from source. The early versions (up to Go 1.4), however, were built with a C compiler.
So at some point, someone wrote enough of a Go GC in C to support enough of Go to compile itself.
I don't understand the question as it's written.
But the shape of the question feels like you're asking about whether an interpreter (which the compiler is not) uses the GC of the host language?
Definitely not making sense. Other answers appear to assume you don't know what a compiler is, but I'm not so sure. Re-state the question perhaps?
We can think of the compiler as a function from a string to a string: high-level code (HLC) to low-level code (LLC). The LLC can include the garbage collection code (if it runs as a standalone executable rather than relying on a separate runtime for garbage collection).
The compiler executable itself runs in a compilation process P, which uses memory and has its own garbage collection. (The compiler executable was itself produced by a compilation, using a compiler written in Go itself (self-hosting) or, initially, in another language.)
But the compilation process P is unrelated to the process Q in which the generated code, LLC, will run when first executed. The OS which runs LLC doesn't even know about the compiler - LLC is just another binary file. The garbage collection in P doesn't affect garbage collection in Q.
Indeed, it would be easy for the compiler to generate an assembly program that keeps allocating memory until the system runs out, for example when compiling a loop that allocates a struct on each of a billion iterations. That is, unless you explicitly also generate a garbage collector as part of the low-level code.
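To make the "string to string" picture concrete, here is a toy sketch in C. It is not how the Go toolchain works: the one-statement source language (`new <n>`), the `compile` function, and the `RUNTIME` text with its `rt_alloc` helper are all made up for illustration. The point is only that the runtime travels inside the emitted program (the LLC), so whatever memory management the compiler process P uses has no bearing on the process Q that later runs the output.

```c
#include <stdio.h>

/* Runtime source that the toy compiler copies into every program it emits.
   Hypothetical: a real runtime would also carry the collector itself. */
static const char RUNTIME[] =
    "static char heap[1 << 20];\n"
    "static unsigned long heap_end;\n"
    "static void *rt_alloc(unsigned long n) {\n"
    "    if (n > sizeof heap - heap_end) return 0;\n"
    "    void *p = &heap[heap_end];\n"
    "    heap_end += n;\n"
    "    return p;\n"
    "}\n";

/* Translate the toy statement "new <n>" into C text written to out.
   Returns 0 on success, -1 on a parse error or if out is too small. */
int compile(const char *src, char *out, size_t cap) {
    unsigned long n;
    if (sscanf(src, "new %lu", &n) != 1) {
        return -1;  /* not a statement this toy language knows */
    }
    /* Emit the runtime first, then the translated user code. */
    int len = snprintf(out, cap,
                       "%s\nint main(void) {\n"
                       "    return rt_alloc(%lu) ? 0 : 1;\n"
                       "}\n",
                       RUNTIME, n);
    return (len < 0 || (size_t)len >= cap) ? -1 : 0;
}
```

Compiling the source string `new 64` into a buffer yields a complete C program whose `main` calls `rt_alloc(64)`. That program carries its own heap, and nothing in it refers back to the process that generated it.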
Your question does become very interesting in the realm of security. There is a famous paper, Ken Thompson's "Reflections on Trusting Trust", which shows that a compiled compiler can still contain a backdoor even if the source of the program being compiled and the source of the compiler are both trustworthy, because the binary that compiled the compiler had a backdoor.