Ah, I think I found the reason why WebAssembly (in a browser or some other sandboxed environment) is not a suitable substrate for near-native performance. It is a very ironic reason: you can't implement a JIT compiler that targets WebAssembly in a sandbox running in WebAssembly. That sounds like an incredibly contrived thing to do, but once speed is the goal, a copy-and-patch compiler is a valid strategy for implementing an interpreter or a modern graphics pipeline.
> you can't implement a JIT compiler that targets WebAssembly in a sandbox running in WebAssembly
That's not completely true. With dynamic linking (now supported in WASIX), you can generate and link Wasm modules at runtime easily.
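
To make "generate a module at runtime" concrete, here is a minimal sketch. It does not use WASIX's dynamic linking (I'm not reproducing that API here); it assumes a JavaScript host and uses the standard `WebAssembly.Module` / `WebAssembly.Instance` constructors. The names `sleb128` and `compileAddK` are made up for illustration. It is roughly copy-and-patch in spirit: a fixed byte template for `f(x) = x + K`, with the constant `K` patched in before the module is instantiated.

```typescript
// Hypothetical helper: encode a 32-bit integer as signed LEB128,
// the immediate format used by the i32.const instruction.
function sleb128(value: number): number[] {
  const out: number[] = [];
  let v = value | 0;
  while (true) {
    const byte = v & 0x7f;
    v >>= 7; // arithmetic shift keeps the sign
    const signBit = (byte & 0x40) !== 0;
    if ((v === 0 && !signBit) || (v === -1 && signBit)) {
      out.push(byte);
      return out;
    }
    out.push(byte | 0x80);
  }
}

// Hypothetical helper: emit a tiny module exporting f(x: i32) -> i32 = x + k,
// then compile and instantiate it through the host's WebAssembly API.
function compileAddK(k: number): (x: number) => number {
  const body = [
    0x00,               // no local declarations
    0x20, 0x00,         // local.get 0
    0x41, ...sleb128(k), // i32.const k  (the "patched" hole)
    0x6a,               // i32.add
    0x0b,               // end
  ];
  const bytes = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
    0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f, // type: (i32) -> i32
    0x03, 0x02, 0x01, 0x00,                         // function 0 has type 0
    0x07, 0x05, 0x01, 0x01, 0x66, 0x00, 0x00,       // export "f" = function 0
    0x0a, body.length + 2, 0x01, body.length, ...body, // code section
  ]);
  const module = new WebAssembly.Module(bytes);
  const instance = new WebAssembly.Instance(module, {});
  return instance.exports.f as unknown as (x: number) => number;
}

const addFive = compileAddK(5);
console.log(addFive(10));
```

Note that in this sketch it is the host (JavaScript) that compiles and instantiates the generated bytes. Code running inside a Wasm module would need the host to expose something equivalent as an import, which is the gap a runtime-provided dynamic linking facility would fill.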