I walked away from a job interview a few years ago on this point.
One of the technical questions was: "If you have a DB and a message queue, how do you get your update to alter both or neither (i.e. transactionally)?"
I thought about it for a couple of minutes, then came back with something like "I can't, and you can't either." Then I proposed the usual spiel about using a replicated-state-machine/write-ahead-log/event-sourcing (whatever it might be called at the time) and leaning into eventual consistency as the only practical solution.
He asked if I'd heard about the outbox pattern, so I let him describe it. Sure enough it sounded like this article. The secret to transacting across the database D and the message queue Q:
(D,Q)
is to split D into two parts (the State and the Outbox), transact across those instead
(S,O) Q
and then just pretend that you have a transaction across D and Q.
With an inbox/outbox pattern it's possible. The incoming message might be processed more than once, and an outgoing message might be sent more than once. That's the limitation, and the system needs to be able to handle it.
If you can't de-duplicate messages it's not possible, that's true.
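For anyone who wants to see the (S,O) Q shape concretely, here is a minimal sketch, assuming SQLite as the local database and a plain callable standing in for the queue client. The table names and the `update_with_outbox` / `relay_once` helpers are made up for illustration, not taken from the article.

```python
import sqlite3


def open_db():
    """Create an in-memory database holding both the state and the outbox."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE state (key TEXT PRIMARY KEY, value TEXT)")
    db.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "sent INTEGER NOT NULL DEFAULT 0)"
    )
    db.commit()
    return db


def update_with_outbox(db, key, value):
    """Write the state change and the outgoing message in ONE local transaction."""
    with db:  # commits on success, rolls back if either statement raises
        db.execute(
            "INSERT OR REPLACE INTO state (key, value) VALUES (?, ?)",
            (key, value),
        )
        db.execute("INSERT INTO outbox (payload) VALUES (?)", (f"{key}={value}",))


def relay_once(db, publish):
    """Push unsent outbox rows to the queue, marking each row only after publishing.

    `publish` is a hypothetical stand-in for a real queue client. If the process
    dies after publish() succeeds but before the UPDATE commits, the row is
    published again on the next run, so delivery is at-least-once.
    """
    rows = db.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for msg_id, payload in rows:
        publish(msg_id, payload)
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (msg_id,))
```

The only real transaction here is across `state` and `outbox`. The queue sees each message at least once, and it carries the outbox row id so a consumer has something to de-duplicate on.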
> Sure enough it sounded like this article
FWIW, the article literally talks about the challenges of getting this to actually work, and recommends removing the queue and just using the DB for everything.
Just post to the database, then send to the message queue asynchronously. The consumer should still handle messages idempotently, but at least this follows REST and is transactional.
It's simple and easy to follow. At scale, use multi-tenancy.
The slightly tricky part is that the outbox-to-queue step likely has to deliver into the queue with "at least once, duplicates possible" semantics.
One major trick in distributed systems is to always attempt things in the same order. And then locally, you just store what you’ve seen, for “a long time”. That takes care of a lot of transactional issues — idempotency, retries, exactly-once delivery with no distributed locks, etc.
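A rough sketch of the "store what you've seen for a long time" idea, assuming every message carries a stable id. The `DedupingConsumer` class and its `retention_seconds` parameter are hypothetical names, and the in-memory dict is only for illustration. In a real system the seen-ids would presumably live in the consumer's local database and be committed in the same transaction as the handler's effects.

```python
import time


class DedupingConsumer:
    """Runs the handler at most once per message id within a retention window."""

    def __init__(self, handler, retention_seconds=7 * 24 * 3600, clock=time.monotonic):
        self.handler = handler
        self.retention = retention_seconds
        self.clock = clock
        self.seen = {}  # message id -> time it was first handled

    def receive(self, msg_id, payload):
        """Return True if the message was handled, False if it was a duplicate."""
        now = self.clock()
        # Forget ids older than the retention window ("a long time", not forever).
        self.seen = {m: t for m, t in self.seen.items() if now - t < self.retention}
        if msg_id in self.seen:
            return False  # redelivery or retry: safely ignored
        self.handler(payload)
        self.seen[msg_id] = now
        return True
```

A duplicate that arrives after the retention window has passed would be handled again, so the window has to be longer than any plausible retry or redelivery delay.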
But as someone who builds distributed systems, I can tell you that transactions should be local. Anytime you want to lock something across the network (eg Canisters in ICP) so you can read it, that’s probably a code smell. You probably want to have evented reactive things ripple out, you do need idempotency, but you shouldn’t design your system to read remote state if you can help it. Only subscribe to remote messages.
I envy you DB + distributed systems specialists. Reminds me I still have a lot to learn.
The way they worded that question is bad and, as you say, the outbox pattern does not transactionally update the queue itself. The outbox pattern is nevertheless very useful.