Handling specific functions is optional. If the compiler can compile a function call, it can compile...

kazinator • 10/12/2024 • 1 reply • view on HN

Handling specific functions is optional. If the compiler can compile a function call, it can compile a call to the + or car functions, which then have to be president in the run-time.

Functions like these are obvious targets for special recognition and inlining. Arithmetic code won't be as fast as it could be if every + has to be a call to a function, but it will work.

A compiled Lisp implementation can be bootstrapped to the point where the definition of car in the library looks like:

  (defun car (x) (car x))

And similarly for some other functions. Then there are only two places in the system that know how to actually extract the car field: the compiler source code, and the corresponding compiler executable needed for bootstrapping.

In that case if you remove the car handling from the compiler then the system's self-hosting and bootstrapping ability breaks.

Replies

kragen • 10/12/2024

Of course! That's what I did last time I wrote a Lisp compiler, but this one seems a little more limited. The page explains:

> Finally, comp-funcall compiles code for function calls to the built-in functions, or a recursive call to the main function: (...)

(Emphasis mine.)

And none of his example functions calls any function other than the builtin ones (which, indeed, his compiler does inline) and itself. But it isn't obvious where the limitation comes from; the subroutine call is just a jal instruction to an assembler label, which you would think would work just as well to call another function as to make a recursive call.

Maybe the limitation is that the assembler only assembles a single function at a time, and he doesn't have a link-editing stage or a symbol table implicitly or explicitly shared between assembler calls. In that case it would be fairly simple to extend the compiler to support more general calls, as you reasonably but apparently incorrectly assumed this one already does.

Looking a bit at the assembler http://www.ulisp.com/list?31OE it looks like it only supports jumps to previously defined labels? In $jal and offset I don't see anything that resembles adding a relocation to a list of relocations for a label so it can be backpatched later. But I also don't see how it gets the numerical value for a label that it subtracts *pc* from in offset. In the compiler itself http://www.ulisp.com/list?4Y4Q it seems to be consing up lists of assembly instructions that eventually get evaled, which seems like a kind of janky way to invoke your assembler but whatever, but I don't see where the binding of label names to addresses happens. I can't find anything resembling a symbol table.

I think I'd make a few other changes, though. I'd add closure support and some kind of type tagging; right now it depends on knowing the types are compile time so it's kind of more like a Forth or C compiler with Lisp syntax. And I'd indirect the calls through a runtime-mutable symbol table so you could redefine a function without having to rewrite all the calls to the old function in existing code.

That's what I did last time I wrote a Lisp compiler, anyway; maybe it's not a good tradeoff on today's hardware anymore, since people presumably still only interactively load new definitions a few times a minute at most, but CPUs have gone from a MIPS to a hundred BIPS. So making all your function calls much slower by frustrating the CPU's branch predictor with a PLT in order to speed up relinking after an edit might no longer be a good tradeoff, even if it is what glibc does—glibc doesn't have FASLs.

How are you doing it these days?

➕ show 1 reply

alt Hacker News

Replies