Hacker News

Super-Flat ASTs

90 points | by mmphosis | 12/04/2025 | 22 comments

Comments

pedrozieg | yesterday at 7:23 PM

What I like about this writeup is that it surfaces a tension most “let’s build a compiler” tutorials skip: the AST is both a data structure and a UX boundary. Super-flat layouts are fantastic for cache and memory, but they’re hostile to all the things humans care about (debuggable shapes, easy instrumentation, ad-hoc traversals, “just print this node and its children” in a debugger). A lot of production compilers quietly solve that by having two tiers: a nice, inefficient tree for diagnostics and early passes, and increasingly flattened / interned / arena-allocated forms as you move toward optimization and codegen.

The interesting question for me is where the crossover is now that IDEs and incremental compilation dominate the workload. If your front-end is effectively a long-running service, it might be worth keeping a friendlier AST around and only using a super-flat representation for hot paths like analysis passes or bulk refactors. Otherwise you risk saving a few hundred MB while spending engineer-months making every new pass fight the layout.
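For a concrete picture of the flattened tier, here's a minimal Rust sketch (names and layout are illustrative, not any particular compiler's actual code): nodes live in one `Vec`, and child links are `u32` indices rather than pointers, so the whole tree is one contiguous allocation.

```rust
// Illustrative flat-AST arena: nodes in one Vec, children as u32 handles.
#[derive(Clone, Copy, Debug)]
enum Expr {
    Num(i64),
    // Children are indices into the arena, not boxed pointers.
    Add(u32, u32),
}

struct Arena {
    nodes: Vec<Expr>,
}

impl Arena {
    fn new() -> Self {
        Arena { nodes: Vec::new() }
    }

    // Push a node and hand back its index as a lightweight handle.
    fn alloc(&mut self, e: Expr) -> u32 {
        self.nodes.push(e);
        (self.nodes.len() - 1) as u32
    }

    // Traverse by chasing indices instead of pointers.
    fn eval(&self, id: u32) -> i64 {
        match self.nodes[id as usize] {
            Expr::Num(n) => n,
            Expr::Add(l, r) => self.eval(l) + self.eval(r),
        }
    }
}

fn main() {
    let mut a = Arena::new();
    let one = a.alloc(Expr::Num(1));
    let two = a.alloc(Expr::Num(2));
    let sum = a.alloc(Expr::Add(one, two));
    println!("{}", a.eval(sum));
}
```

The upside is obvious (one allocation, dense cache lines); the downside is exactly the UX cost above — a debugger shows you a `Vec` of enums and opaque integers, not a tree.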

mitchellh | yesterday at 6:03 PM

For a good example of this sort of pattern in the real world, take a look at the Zig compiler source code. I'm sure others might do it but Zig definitely does. I have a now very outdated series on some of the Zig internals: https://mitchellh.com/zig/parser And Andrew's old DoD talk is very good and relevant to this: https://vimeo.com/649009599

More generally, I believe it's fair to call this a form of handle-based design: https://en.wikipedia.org/wiki/Handle_(computing) Handles are EXTREMELY useful for a variety of reasons and imo woefully underused above the lowest system level.
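A minimal sketch of the generational flavor of handle-based design (illustrative Rust, not Zig's actual implementation): each handle carries a generation counter, so a stale handle can be detected instead of silently aliasing a reused slot.

```rust
// Illustrative generational handle: index + generation into a pool.
#[derive(Clone, Copy)]
struct Handle {
    index: u32,
    generation: u32,
}

struct Pool<T> {
    slots: Vec<Option<T>>,
    generations: Vec<u32>,
}

impl<T> Pool<T> {
    fn new() -> Self {
        Pool { slots: Vec::new(), generations: Vec::new() }
    }

    // For brevity this sketch always appends; a real pool would
    // reuse freed slots and bump their generation on reuse.
    fn insert(&mut self, value: T) -> Handle {
        self.slots.push(Some(value));
        self.generations.push(0);
        Handle { index: (self.slots.len() - 1) as u32, generation: 0 }
    }

    fn remove(&mut self, h: Handle) -> Option<T> {
        if self.generations[h.index as usize] != h.generation {
            return None;
        }
        // Bump the generation so old handles to this slot go stale.
        self.generations[h.index as usize] += 1;
        self.slots[h.index as usize].take()
    }

    fn get(&self, h: Handle) -> Option<&T> {
        if self.generations[h.index as usize] == h.generation {
            self.slots[h.index as usize].as_ref()
        } else {
            None // stale handle: the slot was freed since
        }
    }
}

fn main() {
    let mut p = Pool::new();
    let h = p.insert("expr-node");
    assert!(p.get(h).is_some());
    let _ = p.remove(h);
    assert!(p.get(h).is_none()); // stale handle rejected, not dangling
    println!("stale handle rejected");
}
```

That stale-handle check is what a raw index can't give you, which is part of why handles earn their keep above the lowest level.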

munificent | yesterday at 7:29 PM

It looks like, overall, this design gets the parser about twice as fast as a simple one that creates tree-like ASTs.

That's not nothing. But a parser is rarely the most time-intensive part of a production compiler. And the parser does get iterated on a lot in languages that are evolving and adding new syntax.

Given that, I'd be inclined to take the performance hit and stick with a simpler AST representation if that yields a more hackable, maintainable compiler front end.

shoo | yesterday at 11:49 PM

It'd also have been interesting to see some overall profiling data of the initial program & some discussion of which optimisations to investigate based on that profiling data.

When investigating performance issues, it's often very helpful to run with profiling instrumentation enabled and start by looking at some top-down "cumulative sum" profiler output, to get a big-picture view of which functions/phases are consuming most of the running time and see where it may be worth spending some effort.

Getting familiar with Linux's perf [1] tool is also helpful, both for interpreting summary statistics from perf stat (instructions per cycle, page faults, cache misses, etc.) that can give clues about what to focus on, and for annotating source line by line with time spent.

I'm not familiar with Rust, but e.g. the rustc compiler dev guide has a tutorial on how to profile rustc using perf [2].

[1] Brendan Gregg's Linux perf examples is an excellent place to start: https://www.brendangregg.com/perf.html

[2] https://rustc-dev-guide.rust-lang.org/profiling/with_perf.ht...

mediumdeviation | yesterday at 7:51 PM

For anyone confused about why the text says the performance improves between each graph while the lines don't seem to show it: the color assigned to each key and the axis scale both change between graphs.

loeg | yesterday at 8:44 PM

FWIW I think Clang IR does something like this in a lot of places. It is relatively common to see child nodes stored inline following parent nodes. The APIs more or less abstract this away from consumers like static analysis tools, though.

E.g., https://github.com/llvm/llvm-project/blob/62e00a03fba029f82d...

and

https://github.com/llvm/llvm-project/blob/62e00a03fba029f82d...
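As a rough illustration of the idea (a sketch in the spirit of LLVM's trailing-objects idiom, not its actual API or layout), one can pack each node's children immediately after its header in a single flat buffer, and hide that layout behind an accessor:

```rust
// Illustrative "children stored inline after the parent" layout.
// Each node is two header words (kind, child count) followed by the
// child offsets, all packed into one flat u32 buffer.
struct FlatBuf {
    words: Vec<u32>,
}

impl FlatBuf {
    // Append a node: kind word, count word, then the children inline.
    // Returns the node's offset into the buffer as its handle.
    fn push_node(&mut self, kind: u32, children: &[u32]) -> u32 {
        let at = self.words.len() as u32;
        self.words.push(kind);
        self.words.push(children.len() as u32);
        self.words.extend_from_slice(children);
        at
    }

    // The accessor abstracts the layout away from consumers: it hands
    // back the children as a slice borrowed straight out of the buffer.
    fn children(&self, node: u32) -> &[u32] {
        let n = self.words[node as usize + 1] as usize;
        let start = node as usize + 2;
        &self.words[start..start + n]
    }
}

fn main() {
    let mut b = FlatBuf { words: Vec::new() };
    let lhs = b.push_node(0, &[]);
    let rhs = b.push_node(0, &[]);
    let add = b.push_node(1, &[lhs, rhs]);
    assert_eq!(b.children(add), &[lhs, rhs][..]);
    println!("children of node {}: {:?}", add, b.children(add));
}
```

The accessor is the whole trick: consumers iterate children without ever knowing they're reading bytes that physically trail the parent.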

vatsachak | today at 3:52 AM

We need to flatten everything. Thanks for the great write-up.

jauntywundrkind | today at 5:54 AM

I want to find some way to link this work to Sea of Nodes representations, but I'm a bit out of my depths to try to do so. https://v8.dev/blog/leaving-the-sea-of-nodes

torginus | yesterday at 9:26 PM

Personally I think this is a neat trick to organize memory, but don't these kinds of objects, packed together in flat buffers, bypass Rust's entire lifetime and safety mechanism?

I mean, if you make an off-by-one error on the indices, you essentially end up reading another node's data.
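To make that concrete, here's a hedged Rust sketch (the types are made up for illustration): an out-of-bounds index at least panics, but an off-by-one index that stays in bounds doesn't panic and isn't UB — it just quietly evaluates the wrong node.

```rust
// Illustrative flat AST where child links are plain u32 indices.
#[derive(Clone, Copy)]
enum Node {
    Num(i64),
    Neg(u32), // child index into the node buffer
}

fn eval(nodes: &[Node], id: u32) -> i64 {
    match nodes[id as usize] {
        Node::Num(n) => n,
        Node::Neg(child) => -eval(nodes, child),
    }
}

fn main() {
    // Correct tree: node 2 negates node 0.
    let nodes = [Node::Num(7), Node::Num(9), Node::Neg(0)];
    assert_eq!(eval(&nodes, 2), -7);

    // Off-by-one child index: still in bounds, so no panic and no UB,
    // just a silently wrong answer (-9 instead of -7).
    let buggy = [Node::Num(7), Node::Num(9), Node::Neg(1)];
    assert_eq!(eval(&buggy, 2), -9);
    println!("in-bounds off-by-one evaluated without any error");
}
```

So memory safety is preserved (indices are bounds-checked), but the borrow checker can no longer help with the logical identity of what a handle points at — that guarantee moves from the type system into the parser's correctness.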
