This article both undersells and oversells the technical challenge exchanges solve.
First, it is of course possible to apply horizontal scaling through sharding. My order on Tesla doesn't affect your order on Apple, so each product could in principle run on its own matching engine, with its own set of gateways, and so on. Most exchanges don't go this far: they might have one cluster for stocks starting with A through E, and so on. So they don't even exhaust the benefits available from horizontal scaling, partly because doing so would be expensive.
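To make that concrete, here's a toy sketch of symbol sharding (everything here is invented for illustration, not how any real exchange routes orders):

```python
import zlib

NUM_SHARDS = 4  # assumed: one sequencer + matching engine per shard

def shard_for(symbol: str) -> int:
    # Stable hash so the same symbol always lands on the same engine.
    return zlib.crc32(symbol.encode()) % NUM_SHARDS

# In practice exchanges often use coarse static ranges instead,
# e.g. {"A-E": 0, "F-J": 1, ...}, traded off against hardware cost.
for symbol, side, qty in [("TSLA", "BUY", 100), ("AAPL", "SELL", 50)]:
    print(f"{symbol} {side} {qty} -> engine {shard_for(symbol)}")
```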
On the other hand, it's not just the sequencer that has to process all these events in strict order. If it were, the whole problem would reduce to handing out a single increasing sequence number per request. The matching engine sitting downstream of the sequencer also has to consume every event and apply a much more complicated algorithm: the matching algorithm described in the article as "a pure function of the log".
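To see the asymmetry, here's a toy sketch (single symbol, limit orders only; all names are mine, not the article's). The sequencer really is just a counter; the stateful matcher behind it is where the work is:

```python
import itertools

seq = itertools.count(1)   # the "sequencer": literally just a counter

bids: list[list] = []      # resting orders as [price, seq_no, qty]
asks: list[list] = []

def on_order(side: str, price: float, qty: int) -> None:
    seq_no = next(seq)     # strict total order over all events
    book, opp = (bids, asks) if side == "BUY" else (asks, bids)
    # Best opposite price first (lowest ask for a buy, highest bid
    # for a sell), then oldest first: price-time priority.
    opp.sort(key=lambda o: (o[0] if side == "BUY" else -o[0], o[1]))
    while qty and opp and (opp[0][0] <= price if side == "BUY" else opp[0][0] >= price):
        resting = opp[0]
        fill = min(qty, resting[2])
        print(f"trade: {fill} @ {resting[0]} (aggressor seq {seq_no})")
        qty -= fill
        resting[2] -= fill
        if resting[2] == 0:
            opp.pop(0)
    if qty:                # any remainder rests on the book
        book.append([price, seq_no, qty])

on_order("SELL", 101.0, 100)   # rests on the ask side
on_order("BUY", 101.0, 60)     # -> trade: 60 @ 101.0
```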
Components outside that core path can generally be scaled more easily: a gateway, for example, cares only about activity on the orders it originally received.
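Which is why you can run as many gateways as you like behind the single sequenced core. A minimal sketch of that filtering (names hypothetical):

```python
class Gateway:
    def __init__(self, gateway_id: str):
        self.gateway_id = gateway_id
        self.my_orders: set[str] = set()   # only state this box needs

    def submit(self, order_id: str) -> None:
        self.my_orders.add(order_id)
        # ... forward the order to the sequencer ...

    def on_event(self, event: dict) -> None:
        # The core broadcasts the full event stream; each gateway
        # reacts only to events on orders it submitted itself.
        if event.get("order_id") in self.my_orders:
            print(f"{self.gateway_id}: update for {event['order_id']}")
```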
The article is largely correct that separating the sequencer from the matching engine allows you to recover if the latter crashes. But this may be only a theoretical benefit: replaying and reprocessing a day's worth of messages takes a substantial fraction of a day, because the system is already operating close to its capacity. And after a crash, you still need to work out which customers believe their orders were executed, and let them cancel outstanding orders.
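Back-of-envelope, with invented numbers:

```python
# Assumed: replay runs only modestly faster than the live message
# rate, because the engine is already sized close to peak load.
hours_into_session = 5.0   # crash happens 5 hours into trading
replay_speedup = 1.25      # hypothetical replay-vs-live rate ratio

replay_hours = hours_into_session / replay_speedup
print(f"replaying {hours_into_session}h of log takes ~{replay_hours:.1f}h")
# ~4.0 hours, during which the market has kept moving
```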
> My order on Tesla doesn't affect your order on Apple
Not necessarily: many exchanges allow orders in one instrument to match against orders in another (very, very common on derivatives exchanges).
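For example, implied pricing on futures calendar spreads. A toy sketch with invented quotes (and note the spread convention varies by product; front minus back is assumed here):

```python
jun = {"bid": 100.25, "ask": 100.50}   # front leg
sep = {"bid": 100.00, "ask": 100.10}   # back leg

# Buying the spread via the legs means buying JUN at its ask and
# selling SEP at its bid, and vice versa for selling it:
implied_ask = jun["ask"] - sep["bid"]  # 0.50
implied_bid = jun["bid"] - sep["ask"]  # 0.15

print(f"implied spread market: {implied_bid:.2f} / {implied_ask:.2f}")
# A spread order lifting that 0.50 executes in the JUN and SEP books
# atomically, so all three instruments must share one sequenced stream
# and cannot live on independent shards.
```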
Once sequencing is done, the matching algorithm can run with some parallelism.
For example, order A and order B might interact with each other... but they also might not. If we assume they don't, we can process them completely independently and in parallel; only if we later determine that they should have interacted do we throw away the results and reprocess.
It is very similar to speculative execution in CPUs: assume something, then throw away the results if the assumption turns out to be wrong.
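A toy illustration of that idea (everything here is invented; a real engine would use a much more careful dependency test than "did two orders consume the same resting liquidity"):

```python
import copy
from concurrent.futures import ThreadPoolExecutor

# Resting sell orders: [order_id, price, qty], best price first.
BOOK = [["S1", 100.0, 50], ["S2", 101.0, 50]]

def match_one(book, order):
    """Match one incoming buy against `book` (mutated in place).
    Returns (fills, ids of resting orders it consumed)."""
    fills, touched, qty = [], set(), order["qty"]
    for level in book:
        oid, price, avail = level
        if qty == 0:
            break
        if price > order["price"] or avail == 0:
            continue
        take = min(qty, avail)
        fills.append((order["id"], oid, take, price))
        touched.add(oid)
        level[2] -= take
        qty -= take
    return fills, touched

def process_batch(orders):
    # Optimistic pass: every order runs against its own private
    # snapshot of the book, in parallel, as if the others didn't exist.
    with ThreadPoolExecutor() as pool:
        spec = list(pool.map(lambda o: match_one(copy.deepcopy(BOOK), o), orders))
    touched = [t for _, t in spec]
    if sum(map(len, touched)) == len(set().union(*touched)):
        return [f for fills, _ in spec for f in fills]   # speculation held
    # Mis-speculation: orders competed for the same resting liquidity.
    # Discard the parallel results and redo serially, in sequence order,
    # like a CPU pipeline flush.
    return [f
            for o in sorted(orders, key=lambda o: o["seq"])
            for f in match_one(BOOK, o)[0]]

batch = [{"id": "A", "seq": 1, "price": 100.0, "qty": 50},
         {"id": "B", "seq": 2, "price": 101.0, "qty": 50}]
print(process_batch(batch))   # both want S1, so the batch is redone serially
```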