Are pixels really the best way to encode position at this point?
It strikes me as odd that boxes are placed precisely using pixels, but the size of text is not specified, as far as I can tell. So boxes are specified in real pixels, yet a canvas still can't be rendered exactly or consistently?
I’m playing with 3D positions derived from higher dimensions right now.
Agreed.
The upside of pixel coordinates is that they do not leave the most important aspect open to interpretation.
But they prevent the format from being text-only at the point of creation:
You'll most likely need some programmatic environment to create non-trivial diagrams.
But then the question is: Why not just an SVG instead?
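To make the "programmatic environment" point concrete, here is a rough sketch of what I mean. Everything in it is an assumption for illustration: the helper name `grid_layout`, the field names (`id`, `x`, `y`, `width`, `height`, `text`) and the default sizes are made up, not taken from any spec. It only shows that pixel positions for more than a handful of boxes end up being computed rather than typed by hand.

```python
import json


def grid_layout(labels, columns=3, box_w=200, box_h=80, gap=40):
    """Place one box per label on a grid and return pixel-positioned nodes.

    Hypothetical helper: field names and default sizes are illustrative only.
    """
    nodes = []
    for i, label in enumerate(labels):
        # Row and column of the i-th box in a grid with `columns` columns.
        row, col = divmod(i, columns)
        nodes.append({
            "id": f"n{i}",
            "x": col * (box_w + gap),   # left edge in pixels
            "y": row * (box_h + gap),   # top edge in pixels
            "width": box_w,
            "height": box_h,
            "text": label,
        })
    return nodes


canvas = {"nodes": grid_layout(["a", "b", "c", "d"])}
print(json.dumps(canvas, indent=2))
```

Note that nothing here says how large the text inside each box is, which is the inconsistency raised above.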