logoalt Hacker News

koolbalast Friday at 7:50 PM6 repliesview on HN

The article says:

> Never Handroll Your Own Two-Phase Commit

And then buried at the end:

> A few notable features of this snippet:

> Limiting number of retries makes the code more complicated, but definitely worth it for user-facing side-effects like emails.

This isn't two-phase commit. This is lock the DB indefinitely while remote system is processing and pray we don't crash saving the transaction after it completes. That locked also eats up a database connection so your concurrency is limited by the size of your DB pool.

More importantly, if the email sends but the transaction to update the task status fails, it will try again. And again. Forever. If you're going to track retries it would have to be before you start the attempt. Otherwise the "update the attempts count" logic itself could fail and lead to more retries.

The real answer to all this is to use a provider that supports idempotency keys. Then when you can retry the action repeatedly without it actually happening again. My favorite article on this subject: https://brandur.org/idempotency-keys


Replies

greener_grasstoday at 9:21 AM

> The real answer to all this is to use a provider that supports idempotency keys. Then when you can retry the action repeatedly without it actually happening again. My favorite article on this subject: https://brandur.org/idempotency-keys

Turtles all the way down?

Let's say you are the provider that must support idempotency keys? How should it be done?

show 1 reply
maxmcdlast Friday at 8:09 PM

Just that row should be locked since it's: "for update skip locked".

I agree the concurrency limitation is kind of rough, but it's kind of elegant because you don't have to implement some kind of timeout/retry thing. You're certainly still exposed to the possibility of double-sending, so yes, probably much nicer to update the row to "processing" and re-process those rows on a timeout.

surprisetalklast Friday at 9:08 PM

Author here! Posting from phone while traveling so sorry for bad formatting.

It was outside of the scope of this essay, but a lot of these problems can be resolved with a mid-transaction COMMIT and reasonable timeouts

You can implement a lean idempotency system within the task pattern like this, but it really depends on what you need and what failures you want to prevent

Thanks for providing more context and safety tips! :)

tracker1last Friday at 10:29 PM

For similar systems I've worked on, I'll use a process id, try/tries and time started as part of the process plucking an item of a db queue table... this way I can have something that resets anything started over N minutes prior that didn't finish, for whatever reason (handling abandoned, broken tasks that are in an unknown state.

One reason to do this for emails, IE a database queue is to keep a log/history of all sent emails, as well as a render for "view in browser" links in the email itself. Not to mention those rare instances where an email platform goes down and everything starts blowing up.

lelanthrantoday at 10:41 AM

> This isn't two-phase commit.

Agreed

> This is lock the DB indefinitely while remote system is processing and pray we don't crash saving the transaction after it completes.

I don't really see the problem here (maybe due to my node.js skillz being less than excellent), because I don't see how it's locking the table; that one row would get locked, though.

show 1 reply
morshu9001last Friday at 8:27 PM

Idempotency is key. Stripe is good about that.