Hacker News

shakna, last Wednesday at 9:41 AM (4 replies)

I started building a Forth recently, but decided that instead of writing an interpreter or transpiler or whatever, I'd map words to machine-code bytes in memory and just execute them directly.

This non-optimising JIT has been far, far easier than all the scary articles and comments I've seen led me to believe.

I'm already in the middle of making it work on both AArch64 and RISC-V, a couple of weeks in.
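
A minimal sketch of the "map words to bytes" step, assuming a toy Forth with only two words (`1+` and `1-`) and the top of stack held in the first argument register. The word tables, the `compile_words` name, and the register convention are illustrative assumptions, not the commenter's actual design. The sketch only builds the bytes; it does not place them in executable memory or run them.

```python
import struct

# Hypothetical word tables: each Forth word maps to one fixed 32-bit
# instruction acting on the top-of-stack register
# (x0 on AArch64, a0 on RISC-V).
WORDS = {
    "aarch64": {
        "1+": 0x91000400,  # add x0, x0, #1
        "1-": 0xD1000400,  # sub x0, x0, #1
    },
    "riscv64": {
        "1+": 0x00150513,  # addi a0, a0, 1
        "1-": 0xFFF50513,  # addi a0, a0, -1
    },
}

# Return instruction appended after the last word.
RET = {
    "aarch64": 0xD65F03C0,  # ret
    "riscv64": 0x00008067,  # ret (jalr x0, 0(ra))
}


def compile_words(source, target):
    """Translate whitespace-separated words into raw machine-code bytes."""
    table = WORDS[target]
    code = bytearray()
    for word in source.split():
        if word not in table:
            raise ValueError(f"unknown word: {word}")
        # Both targets store 32-bit instructions in little-endian order.
        code += struct.pack("<I", table[word])
    code += struct.pack("<I", RET[target])
    return bytes(code)
```

For example, `compile_words("1+ 1+", "riscv64")` yields twelve bytes: two `addi` instructions followed by `ret`. Running them would additionally require copying the bytes into memory mapped as executable and calling it through a function pointer, which is platform-specific and left out here.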


Replies

pjmlp, last Wednesday at 11:45 AM

We took a similar approach back in the day, when working through the Tiger language[0], using the Java version of the book.

Our approach was to model the compiler IR as assembly macros and follow the classical UNIX compiler build pipeline. Even though it wasn't the most performant compiler in the world, we could still enjoy having our toy compiler generate real executables in the end.

[0] - https://www.cs.princeton.edu/~appel/modern
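
A rough sketch of the IR-as-assembly-macros idea, assuming a tiny stack-based IR, GNU assembler macro syntax, and x86-64 as the target. The macro names (`IR_CONST`, `IR_ADD`, `IR_MUL`), the IR shape, and `emit_assembly` are all hypothetical; the original project's IR and macros are not described in the comment.

```python
# Hypothetical macro prelude: one assembler macro per IR operation.
# Operands live on the machine stack.
PRELUDE = r"""
.macro IR_CONST value
    pushq $\value
.endm

.macro IR_ADD
    popq %rax
    addq %rax, (%rsp)
.endm

.macro IR_MUL
    popq %rax
    imulq (%rsp), %rax
    movq %rax, (%rsp)
.endm
""".strip()

# Number of operands each IR operation takes.
ARITY = {"const": 1, "add": 0, "mul": 0}


def emit_assembly(ir):
    """Emit the macro prelude followed by one macro invocation per IR op."""
    lines = [PRELUDE, ""]
    for op, *args in ir:
        if op not in ARITY or len(args) != ARITY[op]:
            raise ValueError(f"bad IR instruction: {(op, *args)}")
        operands = ", ".join(str(a) for a in args)
        lines.append(f"    IR_{op.upper()} {operands}".rstrip())
    return "\n".join(lines) + "\n"
```

The compiler itself then only has to print macro invocations; the assembler expands them into instructions, and the usual assemble and link steps produce the executable.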

cnity, last Wednesday at 10:57 AM

I did this for WebAssembly WAT (an IR that is syntactically similar to Lisp) by mapping the AST for my Lisp more or less directly to the WAT IR, then emitting the bytecode from there. It was pretty fun.
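
A small sketch of that kind of AST-to-WAT mapping, assuming the Lisp AST is represented as nested Python tuples and only 32-bit integer arithmetic is supported. The `OPS` table and the `to_wat` and `to_module` names are made up for illustration. The sketch stops at WAT text; turning it into binary would be a separate step, for instance with a tool such as `wat2wasm`.

```python
# Hypothetical mapping from Lisp operators to WebAssembly i32 instructions.
OPS = {"+": "i32.add", "-": "i32.sub", "*": "i32.mul"}


def to_wat(expr):
    """Translate a nested-tuple AST into a folded WAT expression."""
    if isinstance(expr, int):
        return f"(i32.const {expr})"
    op, lhs, rhs = expr
    if op not in OPS:
        raise ValueError(f"unsupported operator: {op}")
    # Because WAT's folded form is itself an S-expression, the
    # translation mirrors the shape of the input tree.
    return f"({OPS[op]} {to_wat(lhs)} {to_wat(rhs)})"


def to_module(expr, name="main"):
    """Wrap an expression in a module exporting one function."""
    return f'(module (func (export "{name}") (result i32) {to_wat(expr)}))'
```

For instance, the AST `("+", 1, ("*", 2, 3))` becomes `(i32.add (i32.const 1) (i32.mul (i32.const 2) (i32.const 3)))`.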

simpleui, last Wednesday at 10:07 AM

Very interesting, care to share the source?

mananaysiempre, last Wednesday at 12:28 PM

I mean, it’s not hard as such, the encodings of some instruction sets are just ass, with 32- and 64-bit x86 as the foremost example and Thumb-2 not far behind it. Also, if you’re dynamically patching existing code, you’ll have to contend with both modern OSes (especially “hardening” patches thereto) making your life harder in bespoke incompatible ways (see: most of libffi) and modern CPUs being very buggy around self-modifying code. Other than that, it just takes quite a bit of tedious but straightforward work to get anywhere.
