
WASM is not quite a stack machine

83 points by signa11 today at 4:34 AM | 27 comments

Comments

ronin_niron today at 12:33 PM

One thing nobody's named in the thread yet: WASM's validator is linear-time, single-pass, with no dataflow joins. That constraint is what gives the operand stack its weird shape. Every block, loop, and if carries a function-type signature. The operand stack at block entry has to match the parameter type, and at exit has to match the result type. Inside a block the validator only sees pushes and pops within that frame; it never has to merge stacks from sibling control-flow paths, because each path's exit type is independently checked against the same expected signature.

JVM went the other way: arbitrary control flow plus a verifier that does dataflow with type merges across joins. That's expensive enough that JVMs do it once at class load and cache the verified state. WASM specifically didn't want that bill; fast startup was a hard requirement.

So the prefix/postfix debate elsewhere in the thread is downstream of this. The encoded form is postfix because that's what trivially admits a linear validator; the textual LISP form is sugar for the same expression trees inside a typed frame. dup isn't missing for aesthetic reasons either: local.tee n followed by local.get n already gives you dup-equivalence through typed locals, and any stack op that didn't reduce to typed locals would either duplicate what locals already do, or break the validator's linearity guarantee.
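
A minimal sketch of that single-pass check, in Python rather than WAT so it can be run directly. It covers only a made-up subset (i32 constants, add, drop, locals, `block`, `if`) and skips branches and unreachable code entirely. The tuple encoding and the names `validate_seq` and `is_valid` are illustrative assumptions, not the spec's algorithm. What it tries to show is that each arm of an `if` is checked on its own against the declared signature, so no stack merge is ever needed:

```python
I32 = "i32"


class ValidationError(Exception):
    """Raised when an instruction sequence does not type-check."""


def validate_seq(instrs, local_types, params, results):
    """Check one frame in a single left-to-right pass.

    The frame starts with `params` on its own operand stack and must end
    with exactly `results`. Nested bodies get a fresh frame, so nothing is
    ever merged across control-flow paths.
    """
    stack = list(params)

    def pop(expected):
        if not stack or stack[-1] != expected:
            raise ValidationError(f"expected {expected}, stack is {stack}")
        stack.pop()

    for ins in instrs:
        op = ins[0]
        if op == "i32.const":
            stack.append(I32)
        elif op == "i32.add":
            pop(I32)
            pop(I32)
            stack.append(I32)
        elif op == "drop":
            if not stack:
                raise ValidationError("drop on empty stack")
            stack.pop()
        elif op == "local.get":
            stack.append(local_types[ins[1]])
        elif op == "local.set":
            pop(local_types[ins[1]])
        elif op == "local.tee":
            # Pops and re-pushes the same type: the stack shape is unchanged.
            pop(local_types[ins[1]])
            stack.append(local_types[ins[1]])
        elif op in ("block", "if"):
            _, bparams, bresults, *bodies = ins
            if op == "if":
                pop(I32)  # the condition
            for t in reversed(bparams):
                pop(t)
            # Each body is checked independently against the same signature.
            for body in bodies:
                validate_seq(body, local_types, bparams, bresults)
            stack.extend(bresults)
        else:
            raise ValidationError(f"unknown instruction {op!r}")

    if stack != list(results):
        raise ValidationError(f"frame ends with {stack}, expected {list(results)}")
    return True


def is_valid(instrs, local_types, params, results):
    """Convenience wrapper (hypothetical): True if the frame type-checks."""
    try:
        return validate_seq(instrs, local_types, params, results)
    except ValidationError:
        return False


# dup expressed through a typed local: tee then get.
dup_via_local = [("i32.const", 7), ("local.tee", 0), ("local.get", 0), ("i32.add",)]
print(is_valid(dup_via_local, [I32], [], [I32]))  # True
```

Every instruction is visited once, and an `if` whose arms disagree fails because one arm misses the signature, not because two stacks were compared with each other.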

stevefan1999 today at 7:45 AM

I'm trying to implement a WASM-to-C compiler, and because of that not-quite-a-stack behavior, I can actually guarantee that it will always build an expression, and I don't have to discard or reset stack values! Everything stays within that function, which is very neat, and I think it is one of the reasons WAT, the textual format, is so neat: you can represent it as an S-expression.

Hendrikto today at 9:51 AM

The series of articles linked at the end (troubles.md/posts/wasm-is-not-a-stack-machine/) is even more interesting, imo.

Very well articulated and concise critique by somebody who seems to have a great amount of knowledge and experience with the topics.

ufo today at 9:41 AM

The author seems to complain about a lack of stack manipulation instructions like dup and rot, but at least for me that's what I would expect from an average programming language stack machine. Even Java, which does have those instructions, mostly doesn't use them for that; reuse happens via local variables.

The way I see it, the difference between register and stack vms is all about the instruction encoding. Register VMs have fatter instructions in exchange for needing fewer LOAD and STORE operations. Despite the name, register VMs also have a stack.
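
A rough sketch of that encoding trade-off, in Python. The two toy compilers and the opcode names (`LOAD`, `STORE`, `MOVE`, `ADD`, `MUL`) are made up for illustration and do not correspond to any real VM; locals are assumed to live directly in registers on the register side:

```python
def compile_stack(expr, dest):
    """Stack-style code: operands are loaded explicitly, operators take none."""
    code = []

    def emit(e):
        if isinstance(e, str):  # a local variable
            code.append(("LOAD", e))
        else:
            op, lhs, rhs = e
            emit(lhs)
            emit(rhs)
            code.append((op.upper(),))

    emit(expr)
    code.append(("STORE", dest))
    return code


def compile_register(expr, dest):
    """Register-style code: each instruction names its destination and sources."""
    if isinstance(expr, str):
        return [("MOVE", dest, expr)]
    code = []
    temp_count = 0

    def emit(e, target=None):
        nonlocal temp_count
        if isinstance(e, str):  # locals are already registers
            return e
        op, lhs, rhs = e
        a = emit(lhs)
        b = emit(rhs)
        if target is None:  # intermediate result needs a temporary
            target = f"t{temp_count}"
            temp_count += 1
        code.append((op.upper(), target, a, b))
        return target

    emit(expr, dest)
    return code


# a = b + c * d
expr = ("add", "b", ("mul", "c", "d"))
print(len(compile_stack(expr, "a")))     # 6 short instructions
print(len(compile_register(expr, "a")))  # 2 wider instructions
```

The register form needs fewer instructions, but each one carries three operands, which is the "fatter instructions" side of the trade.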

asibahi today at 10:20 AM

I don't really disagree with the main premise of the article, which is that WASM is not really a stack language, but this part just gave me pause:

> In textual Wasm, for example, they are instead represented in a LISP-like notation – not any less or more efficient

The text format, at least when it comes to instructions, is 1 to 1 with the binary format. The LISP-like syntax is mainly just syntactic sugar[1].

    ‘(’ plaininstr  instrs ‘)’ ≡  instrs plaininstr

So (in theory, as far as I understand it) you can just do `(local.get 2 local.get 0 local.get 1)` to mean `local.get 0 local.get 1 local.get 2`, and it works for (almost) any instruction.

Unfortunately, in my limited testing, tools like `wat2wasm` and Binaryen's `wasm-as` don't seem to adhere to (my perhaps faulty understanding of) the spec, and demand that all instructions in a folded block be folded and have the "correct" number of arguments, which makes Binaryen do weird things like

    (return
      (tuple.make     ;; Binaryen only pseudoinstruction
        (local.get 0) ;; or w/e expression
        (local.get 1) ;; or w/e expression
      )
    )

when this is perfectly valid

    local.get 0
    local.get 1
    return

tl;dr: the LISP syntax is just syntax sugar. The textual format is as "stack-like" as the binary format.
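
The desugaring rule quoted above can be sketched mechanically. This is a toy in Python, not how `wat2wasm` or Binaryen work: it handles only plain folded instructions (no `block`/`loop`/`if` forms), does no arity or type checking, and the `parse` and `unfold` helpers are hypothetical names:

```python
def parse(src):
    """Parse one parenthesized form into nested lists of string tokens."""
    tokens = src.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = []
            while tokens[pos] != ")":
                node.append(read())
            pos += 1  # consume ")"
            return node
        return tok

    return read()


def unfold(node):
    """Apply '(' plaininstr instrs ')' == instrs plaininstr, recursively."""
    out = []
    # Nested folded operands come first, in source order...
    for child in node:
        if isinstance(child, list):
            out.extend(unfold(child))
    # ...followed by the instruction itself with its immediates.
    out.append(" ".join(tok for tok in node if isinstance(tok, str)))
    return out


print(unfold(parse("(i32.add (local.get 0) (local.get 1))")))
# ['local.get 0', 'local.get 1', 'i32.add']
```

Since the transformation is purely syntactic, the flat output is exactly the postfix sequence the binary format would carry.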

Edit: An example that is easily done with the stack syntax and not with lisp syntax is the following:

    call function_that_returns_multivalue
    local.set 2 ;; last return
    local.set 1 ;;
    local.set 0 ;; first return

In LISP syntax this would be

    (local.set 0
      (local.set 1
        (local.set 2
          (call function_that_returns_multivalue
            ( ;; whatever input parameters
            )))))

I have not yet tried this with Binaryen but I doubt it flies.

[1]: https://webassembly.github.io/spec/core/text/instructions.ht...

kg today at 9:29 AM

The lack of a dup opcode in Wasm as mentioned in the post is quite annoying when trying to generate compact code. I wish something like it had made it into the spec.

shevy-java today at 10:30 AM

I am sad about WASM. It was a promise for epic greatness.

It has failed to deliver that - so much is clear now. You rarely see any awesome success story shown with regard to WASM nowadays. What happened to the old promises? "Electron will be SUPER fast thanks to WASM" or "use any language, WASM unifies it all for the larger browser ecosystem".

It feels as if WASM is on a path towards extinction. Sure, it is mentioned, it is used, but let's be honest: only a few people really use it. And that won't change either.
