Felk • 08/01/2025 • 3 replies

I see that the author took a 'heuristic' approach to retrying tasks (assigning each task a predetermined expected duration and considering it failed if it wasn't updated in time) and uses SQS. If the solution is homemade anyway, I can only recommend leveraging your database's transactionality for this, which is a common pattern that I have often seen recommended and have also used successfully myself:

- At processing start, update the schedule entry to 'executing', then open a new transaction and lock the entry, skipping tasks that are already locked (`SELECT ... FOR UPDATE SKIP LOCKED`).

- At the end of processing, set it to 'COMPLETED' and commit. This also releases the lock.

This has the following nice characteristics:

- You can have parallel processors polling tasks directly from the database without another queueing mechanism like SQS, and have no risk of them picking the same task.

- If you find an unlocked task in 'executing', you know the processor died for sure, because its lock was released when its transaction ended. No heuristic needed.
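
To make the two steps above concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not the setup anyone in this thread actually runs: the `tasks` table, its `id` and `status` columns, the 'pending' state, and the names `run_one` and `handler` are all hypothetical, and `conn` is assumed to be a DB-API connection to PostgreSQL (for example from psycopg) with autocommit off.

```python
# Sketch only. Table/column names (tasks, id, status) and the 'pending'
# state are hypothetical. `conn` is assumed to be a DB-API connection to
# PostgreSQL with autocommit off, so statements run in a transaction
# until commit() or rollback().

# Step 1: flip one pending task to 'executing', skipping rows that other
# processors currently hold locked.
MARK_EXECUTING = """
UPDATE tasks SET status = 'executing'
WHERE id = (
    SELECT id FROM tasks
    WHERE status = 'pending'
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id
"""

# Step 2: in a new transaction, take a row lock that is held while working.
LOCK_TASK = """
SELECT id FROM tasks
WHERE id = %s AND status = 'executing'
FOR UPDATE SKIP LOCKED
"""

MARK_COMPLETED = "UPDATE tasks SET status = 'COMPLETED' WHERE id = %s"

# Recovery: rows that are 'executing' but NOT locked belong to a processor
# whose transaction ended without completing the task.
FIND_ORPHANED = """
SELECT id FROM tasks
WHERE status = 'executing'
FOR UPDATE SKIP LOCKED
"""


def run_one(conn, handler):
    """Claim and process one task; return its id, or None if none was claimed."""
    cur = conn.cursor()

    cur.execute(MARK_EXECUTING)
    row = cur.fetchone()
    conn.commit()  # makes the 'executing' status visible to other processors
    if row is None:
        return None
    task_id = row[0]

    # The next statement implicitly starts a new transaction.
    cur.execute(LOCK_TASK, (task_id,))
    if cur.fetchone() is None:
        conn.rollback()  # another processor locked this row first
        return None

    try:
        handler(task_id)  # the row lock is held for the whole duration
        cur.execute(MARK_COMPLETED, (task_id,))
        conn.commit()  # commit also releases the lock
    except Exception:
        # Rollback releases the lock; the row stays 'executing' and unlocked,
        # which is exactly what FIND_ORPHANED looks for.
        conn.rollback()
        raise
    return task_id
```

A recovery pass would run `FIND_ORPHANED` inside its own transaction and then retry or reset whatever rows it returns. Note that the row lock is held for as long as `handler` runs.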

Replies

diarrhea • 08/01/2025

This introduces long-running transactions, which at least in Postgres should be avoided.

renewiltord • 08/01/2025

You don't have to keep the transaction open. What I do is:

1. Select next job

2. Update status to executing where jobId = thatJob and status is pending

3. If the previous update affected 0 rows, you didn't get the job, so go back to selecting the next job

If you have "time to select" <<< "time to do", this works great. But if the two are closer together, you can see how this is mostly going to cause contention, and you shouldn't do it.
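
The three steps above can be sketched as follows. This is a minimal, hedged illustration that uses SQLite from the Python standard library so it runs anywhere. The `jobs` table, its columns, and the function names are assumptions made for the example, not details from the comment.

```python
import sqlite3


def claim_next_job(conn):
    """Return the id of a job this worker claimed, or None if none are pending."""
    while True:
        # 1. Select the next job (no lock is taken here).
        row = conn.execute(
            "SELECT id FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None

        # 2. Conditional update: only succeeds if the job is still pending.
        cur = conn.execute(
            "UPDATE jobs SET status = 'executing' "
            "WHERE id = ? AND status = 'pending'",
            (row[0],),
        )
        conn.commit()  # short transaction; nothing stays open while working

        # 3. Zero affected rows means another worker got there first.
        if cur.rowcount == 1:
            return row[0]


def demo():
    # Hypothetical schema, in-memory database for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO jobs (status) VALUES (?)", [("pending",), ("pending",)]
    )
    conn.commit()
    # Two jobs exist, so the third claim finds nothing pending.
    return [claim_next_job(conn) for _ in range(3)]
```

The retry loop in step 3 is where the contention mentioned above shows up: when many workers select the same row at nearly the same time, all but one of them see 0 affected rows and have to loop again.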

alex5207 • 08/01/2025

This is exactly what we're doing. Works like a charm.
