Hacker News

We rewrote our Rust WASM parser in TypeScript and it got faster

137 points | by zahlekhan | yesterday at 9:48 PM | 77 comments

Comments

rented_mule today at 12:03 AM

Something not unlike this happened to me when moving some batch processing code from C++ to Python 1.4 (this was 1997). The batch started finishing about 10x faster. We refused to believe it at first and started looking to make sure the work was actually being done. It was.

The port had been done in a weekend just to see if we could use Python in production. The C++ code had taken a few months to write. The port was pretty direct, function for function. It was even line for line where language and library differences didn't offer an easier way.

A couple of us worked together for a day to find the reason for the speedup. Just looking at the code didn't give us any clues, so we started profiling both versions. We found out that the port had accidentally fixed a previously unknown bug in some code that built and compared cache keys. After identifying the small misbehaving function, we had to study the C++ code pretty hard to even understand what the problem was. I don't remember the exact nature of the bug, but I do remember thinking that particular type of bug would be hard to express in Python, and that's exactly why it was accidentally fixed.

We immediately started moving the rest of our back end to Python. Most things were slower, but not by much because most of our back end was i/o bound. We soon found out that we could make algorithmic improvements so much more quickly, so a lot of the slowest things got a lot faster than they had ever been. And, most importantly, we (the software developers) got quite a bit faster.

show 4 replies
blundergoat yesterday at 9:58 PM

The real win here isn't TS over Rust, it's the O(N²) -> O(N) streaming fix via statement-level caching. That's a 3.3x improvement on its own, independent of language choice. The WASM boundary elimination is 2-4x, but the algorithmic fix is what actually matters for user-perceived latency during streaming. Title undersells the more interesting engineering imo.
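
For anyone wondering what that fix looks like, here's a minimal TypeScript sketch of statement-level caching during streaming, assuming ';'-delimited statements; all names are illustrative, not from the article's codebase:

    // Each streamed chunk grows fullText. Completed statements are parsed
    // once and cached; only the trailing partial statement is reparsed,
    // so total work over the stream is O(N) instead of O(N^2).
    class StreamingParser {
      private cachedNodes: string[] = []; // stand-in for real AST nodes
      private parsedUpTo = 0;             // offset of first unparsed char

      update(fullText: string): string[] {
        let end: number;
        while ((end = fullText.indexOf(";", this.parsedUpTo)) !== -1) {
          const stmt = fullText.slice(this.parsedUpTo, end + 1);
          this.cachedNodes.push(parseStatement(stmt)); // parsed exactly once
          this.parsedUpTo = end + 1;
        }
        const partial = fullText.slice(this.parsedUpTo);
        return partial
          ? [...this.cachedNodes, parseStatement(partial)]
          : [...this.cachedNodes];
      }
    }

    // Trivial stand-in for real statement parsing.
    function parseStatement(src: string): string {
      return src.trim();
    }

The naive streaming version reparses the entire accumulated text on every chunk, which is where the quadratic cost comes from.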

show 7 replies
nine_k yesterday at 10:28 PM

"We rewrote this code from language L to language M, and the result is better!" No wonder: it was a chance to rectify everything that was tangled or crooked, avoid every known bad decision, and apply newly-invented better approaches.

So this holds even for L = M. The speedup is not in the language, but in the rewriting and rethinking.

show 4 replies
evmar yesterday at 10:58 PM

By the way, I did a deeper dive on the problem of serializing objects across the Rust/JS boundary, noticed the approach used by serde wasn’t great for performance, and explored improving it here: https://neugierig.org/software/blog/2024/04/rust-wasm-to-js....

show 1 reply
horacemorace today at 3:47 AM

I’m more of a dabbler/script guy than a dev, but Every. Single. Thing. I ever write in JavaScript ends up being incredibly fast. It forces me to think in callbacks and events and promises. Python and C (or async!) seem easy and sorta lazy in comparison.

spankalee yesterday at 10:30 PM

I was wondering why I hadn't heard of Open UI doing anything with WASM.

This new company chose a very confusing name that has been used by the Open UI W3C Community Group for over 5 years.

https://open-ui.org/

Open UI is the standards group responsible for HTML having popovers, customizable select, invoker commands, and accordions. They're doing great work.

simonbw today at 3:03 AM

Yeah, if you're serializing and deserializing data across the JS-WASM boundary (or between web workers in general, whether they're WASM or not), the data marshaling costs can add up. There is a way of sharing memory across the boundary without any marshaling, though: TypedArrays and SharedArrayBuffers. TypedArrays let you transfer ownership of the underlying memory from one worker (or the main thread) to another without any copying. SharedArrayBuffers allow multiple workers to read and write to the same contiguous chunk of memory. The downside is that you lose all the niceties of JavaScript types and you're basically stuck working with raw bytes.

You still do get some latency from the event loop, because postMessage gets queued as a MacroTask, which is probably on the order of 10μs. But this is the price you have to pay if you want to run some code in a non-blocking way.
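
A quick sketch of both options, assuming a worker script at "./parser-worker.js" (the path is made up for illustration):

    const worker = new Worker("./parser-worker.js");

    // Option 1: transfer. No copy is made, but `bytes.buffer` becomes
    // detached (unusable) on this side after the call.
    const bytes = new Uint8Array(1024 * 1024);
    worker.postMessage(bytes.buffer, [bytes.buffer]);

    // Option 2: share. Both sides read and write the same memory.
    // Requires the page to be cross-origin isolated (COOP/COEP headers).
    const shared = new SharedArrayBuffer(1024 * 1024);
    worker.postMessage(shared); // shared, not copied or transferred
    const view = new Uint8Array(shared);
    Atomics.store(view, 0, 1);  // coordinate via atomics, raw bytes only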

sakesun today at 3:22 AM

I heard a lot of similar stories when I started using Python 20+ years ago. A number of people claimed their solutions got faster when developed in Python, mainly because Python made it easier to quickly pivot and experiment with various alternative methods, ultimately yielding a more efficient outcome.

vmsp today at 12:45 AM

Not directly related to the post, but what does OpenUI do? I'm finding it interesting but hard to understand. Is it an intermediate layer that makes LLMs generate better UI?

jeremyjh today at 12:18 AM

> The openui-lang parser converts a custom DSL emitted by an LLM into a React component tree.

> converts internal AST into the public OutputNode format consumed by the React renderer

Why not just have the LLM emit the JSON for OutputNode? Why is a custom "language" and parser needed at all? And yes, there is a cost for marshaling data, so you should avoid doing it where possible, and do it in large chunks when it's not avoidable. This is not an unknown phenomenon.
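
To illustrate the chunking point, a sketch with hypothetical wasm-bindgen exports (`push_node`/`push_nodes` are not from the post):

    declare const wasmParser: {
      push_node(json: string): void;
      push_nodes(json: string): void;
    };
    declare const nodes: object[];

    // N boundary crossings: every call pays the fixed serialization and
    // call overhead of the JS/WASM boundary.
    for (const node of nodes) {
      wasmParser.push_node(JSON.stringify(node));
    }

    // One boundary crossing: the same data, but the per-call overhead is
    // paid once and amortized over the whole batch.
    wasmParser.push_nodes(JSON.stringify(nodes));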

joaohaas yesterday at 11:52 PM

God I hate AI writing.

That final summary benchmark means nothing. It lists a 'baseline' value for the 'Full-stream total' of the Rust implementation, then says `serde-wasm-bindgen` is '+9-29% slower', but it never gives us that baseline value, because clearly the only benchmark run against the Rust codebase was the per-call one.

Then it mentions: "End result: 2.2-4.6x faster per call and 2.6-3.3x lower total streaming cost."

But the "2.6-3.3x" is by their own definition a comparison against the naive TS implementation.

I really think the guy just prompted claude to "get this shit fast and then publish a blog post".

nallana yesterday at 11:48 PM

Why not a shared buffer? Serializing into JSON on this hot path should be entirely avoidable.

show 2 replies
slopinthebag today at 12:16 AM

This article is obviously AI generated, and besides being jarring to read, it makes me really doubt its validity. You can get substantially faster parsing versus `JSON.parse()` by parsing structured binary data, and it's also faster to pass a byte array than a JSON string from wasm to the browser. My guess is that not only was the article AI generated, but so were the benchmarks, and perhaps the implementation as well.
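
For reference, a rough sketch of the byte-array approach, with hypothetical `result_ptr`/`result_len` exports (a real module's exports will differ):

    const { instance } = await WebAssembly.instantiateStreaming(
      fetch("parser.wasm") // hypothetical module URL
    );
    const exports = instance.exports as {
      memory: WebAssembly.Memory;
      result_ptr: () => number;
      result_len: () => number;
    };

    // A view straight into wasm linear memory: no JSON string is ever
    // built on the wasm side or parsed on the JS side.
    const bytes = new Uint8Array(
      exports.memory.buffer,
      exports.result_ptr(),
      exports.result_len()
    );
    // Decode `bytes` with whatever binary layout both sides agreed on,
    // e.g. length-prefixed records read through a DataView.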

show 1 reply
envguard today at 12:30 AM

The WASM story is interesting from a security angle too. WASM modules inheriting the host's memory model means any parsing bugs that trigger buffer overreads in the Rust code could surface in ways that are harder to audit at the JS boundary. Moving to native TS at least keeps the attack surface in one runtime, even if the theoretical memory safety guarantees go down.

kennykartman today at 12:41 AM

I dream of the day when there's no need to go through JS and Wasm can do the whole job by itself. Meanwhile, we are stuck.

marcosdumay today at 12:39 AM

It would be great if people stopped dismissing the problems caused by WASM not being a first-class runtime on the web.

owenpalmer today at 1:50 AM

So this is an issue with WASM/JS interop, not with Rust per se?

dmix yesterday at 10:25 PM

That blog post design is very nice. I like the 'scrollspy' sidebar which highlights all visible headings.

Claude tells me this is https://www.fumadocs.dev/

show 1 reply
ivanjermakov yesterday at 11:48 PM

Good software is usually written on the 2nd+ try.

caderosche yesterday at 10:33 PM

What is the purpose of the Rust WASM parser? Didn't understand that easily from the article. Would love a better explanation.

show 1 reply
nssnsjsjs yesterday at 11:54 PM

Rewrite bias. You want to also rewrite the Rust one in Rust for comparison.

show 1 reply
measurablefunc today at 3:01 AM

I tried a similar experiment recently with an FFT for WAV files in the browser, and JavaScript was faster than wasm. It was mostly vibe-coded Rust compiled to wasm, but FFT is a well-known algorithm, so I don't think there were any low-hanging performance improvements left to pick.

neuropacabra yesterday at 11:38 PM

This is a very unusual statement :-D

szmarczak yesterday at 11:28 PM

> Attempted Fix: Skip the JSON Round-Trip

> We integrated serde-wasm-bindgen

So you're reinventing JSON, but binary? V8's JSON is highly optimized nowadays [1] and can process gigabytes per second [2]; I doubt it's the bottleneck here.

[1] https://v8.dev/blog/json-stringify
[2] https://github.com/simdjson/simdjson
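
A rough way to sanity-check that for a parser-output-shaped payload (a crude micro-benchmark, not from the post):

    // Measure JSON.parse throughput; payload.length ~ bytes for ASCII.
    const payload = JSON.stringify({
      nodes: Array.from({ length: 100_000 }, (_, i) => ({
        id: i, type: "text", value: `node ${i}`,
      })),
    });
    const iterations = 50;
    const t0 = performance.now();
    for (let i = 0; i < iterations; i++) JSON.parse(payload);
    const seconds = (performance.now() - t0) / 1000;
    console.log(`~${((payload.length * iterations) / 1e6 / seconds).toFixed(0)} MB/s`);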

show 1 reply
slowhadoken yesterday at 11:14 PM

Am I mistaken or isn’t TypeScript just Golang under the hood these days?

show 2 replies
SCLeo yesterday at 11:09 PM

They should rewrite it in rust again to get another 3x performance increase /s
