Hacker News

orbisvicis today at 1:22 AM (3 replies)

I'm not following. Doesn't the outbox pattern just pass the buck?

The motive seems to be a naive process that enqueues a message and then commits to a database - two independent actions. But a well-behaved process would commit to the database and then, only if that succeeds, enqueue a message. That's better but still not atomic - commit, crash, and no message queued.

So the solution is a two-table write - the outbox pattern. But the process that reads the outbox must commit both a query and a delete before sending the message. That's the same risk as the well-behaved program above - commit, crash, and no message queued. Except now you've introduced another pipeline element, so your overall complexity increases, and with it the risk.

What if you never delete messages from the outbox? Well, what you have now is no longer an outbox nor a database nor useful for large volumes. What if you implement a database to track processed messages? Back to square one - that's the same problem you were initially trying to solve.

What if you fetch, enqueue, and then delete? Ohh... that works. In case of a crash the message remains in the outbox. It may be processed in duplicate, but eventually, once the send succeeds, it will be deleted from the outbox.

The message broker then receives a possibly duplicate message. It must consult its internal database, and if the message is unique, route it. So right back at square one. Can't have atomicity and uniqueness.


Replies

KraftyOne today at 2:03 AM

Outbox's power is that it turns an atomicity problem into an idempotency problem. You atomically write to the outbox, then you have an idempotent "workflow" that processes events from the outbox. This turns "at most once" semantics (where an event could be dropped entirely) to "at least once" semantics (where the event processing could run multiple times). For many systems, that's a big improvement.

andix today at 2:43 AM

The message is written to the outbox table in the same transaction as the database changes. Only if the transaction commits is the message actually created and the other tables updated.
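
A minimal sketch of that first step, using Python's built-in `sqlite3`. The `orders` and `outbox` tables, their columns, and the `place_order` helper are all made up for illustration; the only point is that both inserts share one transaction, so a failure anywhere rolls back both.

```python
import json
import sqlite3


def create_schema(conn):
    # Hypothetical tables: one business table and one outbox table.
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )


def place_order(conn, order_id, item):
    """Write the business row and the outgoing message in one transaction."""
    # `with conn` commits if the block finishes and rolls back if it raises,
    # so the order row and the outbox row either both exist or neither does.
    with conn:
        conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))
        # If building or storing the message fails, the order insert above
        # is rolled back as well.
        payload = json.dumps({"type": "order_placed", "order_id": order_id, "item": item})
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))
```

Nothing is sent to a broker here. The message only becomes a durable row that a separate relay can pick up later.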

In a second step the message is taken from the outbox and sent to the queue/broker. Only after it has been sent is the message removed from the outbox. If the sending fails, it stays in the outbox and is retried. If deleting the message from the outbox fails after sending, it gets re-sent later. So you can get a duplicated outgoing message.
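
A rough sketch of that second step, again with `sqlite3`. The `outbox` schema and the `send` callback are assumptions; `send` stands in for whatever broker client you actually use and is expected to raise on failure. The row is deleted only after `send` returns, so a crash between the two leaves the row in place and it is sent again on the next run (at-least-once).

```python
import sqlite3

# Assumed outbox layout for this sketch.
OUTBOX_SCHEMA = (
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
)


def relay_once(conn, send):
    """Send pending outbox rows in id order; return how many were sent and deleted."""
    rows = conn.execute("SELECT id, payload FROM outbox ORDER BY id").fetchall()
    sent = 0
    for row_id, payload in rows:
        try:
            send(row_id, payload)  # hypothetical broker publish; may raise
        except Exception:
            # Leave this row and everything after it for the next run.
            break
        # Delete only after a successful send. If the process dies before this
        # commit, the row survives and the message goes out a second time.
        with conn:
            conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
        sent += 1
    return sent
```

Passing the row id to `send` lets the publisher attach it as a message id, which is what a receiver would need in order to spot the duplicate.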

Message brokers usually don't de-duplicate messages, because they don't have a database that keeps messages. The receivers need to do that, either with idempotency or by tracking message ids. Event sourcing brokers can de-duplicate, because they store all messages.
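
For the "tracking message ids" option, a receiver could look roughly like this. The `processed_messages` and `balances` tables and the `handle_deposit` handler are invented for the example; the idea is only that recording the id and applying the effect happen in the same transaction, so a redelivered message becomes a no-op.

```python
import sqlite3


def create_receiver_schema(conn):
    # Hypothetical receiver-side tables.
    conn.execute("CREATE TABLE processed_messages (message_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER NOT NULL)")


def handle_deposit(conn, message_id, account, amount):
    """Apply a deposit at most once per message_id; return False for a duplicate."""
    with conn:
        # INSERT OR IGNORE changes zero rows when the id is already recorded.
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_messages (message_id) VALUES (?)",
            (message_id,),
        )
        if cur.rowcount == 0:
            return False  # already handled, skip the side effect
        cur = conn.execute(
            "UPDATE balances SET amount = amount + ? WHERE account = ?",
            (amount, account),
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO balances (account, amount) VALUES (?, ?)",
                (account, amount),
            )
    return True
```

This only works when the side effect lives in the same database as the id table. If the effect is external (an email, an API call), the receiver is back to needing that effect to be idempotent on its own.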

If you never delete messages from the outbox, then they are re-sent all the time. You are going to notice such a bug really quickly.

Inbox pattern works very similarly, but the other way around.

atomicnumber3 today at 2:05 AM

No, you're right. It basically just passes the buck. But the general idea is that if your transaction succeeds, you KNOW that there is a durable record that some external thing needs to end up in a message bus. And then something else can sit there and spin retries until it happens. It gives you the opportunity for retrying getting it onto the message bus, out of band of the process that is trying to initiate the enqueue.

And the outbox pattern isn't bs - it DOES help a lot in practice. But exactly how much it _guarantees_ that something happens is of course still quite limited. And yes, as you note, it's an at-least-once strategy.