"Ironically, among the four stages, the compiler (translation to assembly) is the most approachable one for an AI to build. It is mostly about pattern matching and rule application: take C constructs and map them to assembly patterns.
The assembler is harder than it looks. It needs to know the exact binary encoding of every instruction for the target architecture. x86-64 alone has thousands of instruction variants with complex encoding rules (REX prefixes, ModR/M bytes, SIB bytes, displacement sizes). Getting even one bit wrong means the CPU will do something completely unexpected.
The linker is arguably the hardest. It has to handle relocations, symbol resolution across multiple object files, different section types, position-independent code, thread-local storage, dynamic linking and format-specific details of ELF binaries. The Linux kernel linker script alone is hundreds of lines of layout directives that the linker must get exactly right."
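To make the encoding jargon in that quote concrete, here is a minimal sketch (mine, not from the article) of how one register-to-register 64-bit MOV picks up its REX prefix and ModR/M byte:

    /* Sketch: encode "mov dst, src" for 64-bit GPRs using opcode 0x89
       (MOV r/m64, r64). Register numbers are 0..15 (rax=0, rbx=3, ...). */
    #include <stdint.h>
    #include <stdio.h>

    static int encode_mov_rr(uint8_t *out, unsigned dst, unsigned src) {
        uint8_t rex   = 0x48 | ((src >> 3) << 2) | (dst >> 3); /* REX.W + REX.R + REX.B */
        uint8_t modrm = 0xC0 | ((src & 7) << 3) | (dst & 7);   /* mod=11, reg=src, rm=dst */
        out[0] = rex; out[1] = 0x89; out[2] = modrm;
        return 3;
    }

    int main(void) {
        uint8_t buf[3];
        int n = encode_mov_rr(buf, 0, 3);                      /* mov rax, rbx */
        for (int i = 0; i < n; i++) printf("%02x ", buf[i]);   /* prints: 48 89 d8 */
        printf("\n");
        return 0;
    }

One wrong bit in that ModR/M byte and, as the quote says, the CPU happily does something else.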
I've worked on compilers, assemblers, and linkers, and this is almost exactly backwards.
Claude one-shotted a basic x86 assembler + linker for me. It's missing lots of instructions, yes, but that's a matter of filling in tables of data mechanically.
Supporting linker scripts is marginally harder, but as someone who has written compilers by hand before, my experience is the exact opposite of yours.
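A rough illustration of what "filling in tables of data" means in practice (these few rows are transcribed from the x86-64 manual by me, not taken from the parent's assembler; the field names are illustrative):

    /* Sketch: the kind of table an assembler is mostly made of. Each row is
       copied mechanically out of the architecture manual. */
    #include <stdint.h>

    struct insn_form {
        const char *mnemonic;
        uint8_t     opcode;
        int         has_modrm;   /* instruction takes a ModR/M byte */
        int         rex_w;       /* needs REX.W for the 64-bit form */
    };

    static const struct insn_form table[] = {
        { "add", 0x01, 1, 1 },   /* ADD r/m64, r64 */
        { "sub", 0x29, 1, 1 },   /* SUB r/m64, r64 */
        { "mov", 0x89, 1, 1 },   /* MOV r/m64, r64 */
        { "ret", 0xC3, 0, 0 },   /* RET */
    };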
I am inclined to agree with you... but did CC produce a working linker as well as a working compiler?
I thought it was just the compiler that Anthropic produced.
Exactly this. A linker is threading the given blocks together with fixups for position-independent code, which you could call rule application. An assembler is pattern matching.
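As a concrete (hedged) example of what such a fixup looks like, here is roughly how a static linker applies one common ELF relocation type, R_X86_64_PC32, computed as S + A - P:

    /* Sketch: apply an R_X86_64_PC32 relocation. The patched 32-bit field
       becomes symbol address + addend - address of the field (S + A - P).
       Overflow and error checks are omitted. */
    #include <stdint.h>
    #include <string.h>

    static void apply_pc32(uint8_t *section,        /* section contents in memory */
                           uint64_t section_vaddr,  /* where the section will live */
                           uint64_t r_offset,       /* offset of the field to patch */
                           uint64_t sym_vaddr,      /* resolved symbol address (S) */
                           int64_t  addend)         /* relocation addend (A) */
    {
        int64_t value = (int64_t)(sym_vaddr + addend)
                      - (int64_t)(section_vaddr + r_offset); /* S + A - P */
        int32_t field = (int32_t)value;                      /* must fit in 32 bits */
        memcpy(section + r_offset, &field, sizeof field);
    }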
This explanation confused me too:
If each iteration is X percent slower, then a billion iterations will also be X percent slower. I wonder what is actually going on.
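Spelling out the arithmetic being assumed here (with x the fractional per-iteration slowdown, t the original per-iteration time, and n the iteration count):

    T_slow = n * t * (1 + x) = (1 + x) * (n * t) = (1 + x) * T_fast

A uniform per-iteration slowdown scales the total by exactly the same factor, regardless of n, which is why the quoted explanation doesn't add up on its own.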