logoalt Hacker News

yosefktoday at 6:38 AM3 repliesview on HN

"Ironically, among the four stages, the compiler (translation to assembly) is the most approachable one for an AI to build. It is mostly about pattern matching and rule application: take C constructs and map them to assembly patterns.

The assembler is harder than it looks. It needs to know the exact binary encoding of every instruction for the target architecture. x86-64 alone has thousands of instruction variants with complex encoding rules (REX prefixes, ModR/M bytes, SIB bytes, displacement sizes). Getting even one bit wrong means the CPU will do something completely unexpected.

The linker is arguably the hardest. It has to handle relocations, symbol resolution across multiple object files, different section types, position-independent code, thread-local storage, dynamic linking and format-specific details of ELF binaries. The Linux kernel linker script alone is hundreds of lines of layout directives that the linker must get exactly right."

I worked on compilers, assemblers and linkers and this is almost exactly backwards


Replies

stlee42today at 6:45 AM

Exactly this. Linker is threading given blocks together with fixups for position-independent code - this can be called rule application. Assembler is pattern matching.

This explanation confused me too:

  Each individual iteration: around 4x slower (register spilling)
  Cache pressure: around 2-3x additional penalty (instructions do not fit in L1/L2 cache)
  Combined over a billion iterations: 158,000x total slowdown
If each iteration is X percent slower, then a billion iterations will also be X percent slower. I wonder what is actually going on.
vidarhtoday at 7:19 AM

Claude one-shot a basic x86 assembler + linker for me. Missing lots of instructions, yes, but that is a matter of filling in tables of data mechanically.

Supporting linker scripts is marginally harder, but having manually written compilers before, my experience is the exact opposite of yours.

lelanthrantoday at 6:58 AM

I am inclined to agree with you... but, did CC produce a working linker as well as a working compiler?

I thought it was just the compiler that Anthropic produced.

show 1 reply