Do we fear the serializable isolation level more than we fear subtle bugs (2024)

57 points • by b-man • last Wednesday at 2:21 PM • 31 comments

Comments

The article doesn't mention the biggest problem with serializable isolation. At every commit, you need to handle the possibility of a serialization exception and retry the transaction. Traditionally devs and frameworks don't, so your application works fine during development and staging but starts failing under load. It makes commit failures normal, rather than an 'oh shit' problem because your disk has filled or someone has tripped over a network cable.

And how do you retry transactions? Then you hit another issue when using multiple datastores, where you need to learn about two-phase commit and the joys of manually keeping in sync the datastores that don't support it (e.g. filesystems).

And the locks, if you dare run batch updates along with web requests. The long-running transactions lock everything they read, blocking short transactions. Because that is exactly what you asked for. Again, you will miss this during development and only notice under load.

So sure, you might avoid some data consistency issues if your data model and update patterns hit the edge cases. In practice, the details of serializable are not well known because the cases where it matters are rare. Using it gives you safety (maybe that rare case is your case!), but everything needs to be carefully designed around it.
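
For reference, the retry requirement from the first paragraph of this comment can be sketched roughly like this. It is a minimal sketch, not any particular driver's API: `SerializationFailure`, `run_with_retry`, and `flaky_transfer` are hypothetical names, and a real driver would raise its own error type (PostgreSQL reports serialization failures as SQLSTATE 40001). The whole transaction body is re-run from the start, which is why it has to be free of side effects outside the database.

```python
import random
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver's serialization error (e.g. SQLSTATE 40001)."""


def run_with_retry(txn, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Run txn() and re-run it from the start whenever it fails to serialize."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and surface the failure to the caller
            # Exponential backoff with random jitter so retries do not collide again.
            sleep(base_delay * (2 ** (attempt - 1)) * random.random())


# Usage: a fake transaction that fails to serialize twice, then commits.
attempts = []


def flaky_transfer():
    attempts.append(1)
    if len(attempts) < 3:
        raise SerializationFailure("could not serialize access")
    return "committed"


result = run_with_retry(flaky_transfer, sleep=lambda _: None)
```

Anything the transaction does outside the database (sending mail, dispensing cash) has to happen after the loop returns, or it may run once per attempt.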


hyperpape • yesterday at 10:39 PM

> According to the paper, “Of the 22 vulnerabilities, five were level-based, meaning that the default weak isolation level led to the anomalies behind the vulnerabilities. The remaining 17 were scope-based, meaning that the database accesses were not properly encapsulated in transactions and concurrent API requests could trigger the vulnerability independent of the level of isolation provided by the database backend.”

I don't want to commit to a real opinion, but the cynic in me sees a bitter lesson here: the database should default to a low isolation level. The damn developers aren't even using transactions right, so why waste performance handling transactions in the strictest possible way?


lukas221 • yesterday at 9:12 PM

I would argue that not using the serializable isolation level by default is like not using a memory-safe programming language by default.

Sure, sometimes it's too slow, but it should be the default.

Very few people can write correct database code at the other isolation levels. Most think they can, but it's harder than correct multi-threading, because databases do weird unintuitive things for performance.


SoftTalker • yesterday at 10:58 PM

You may not need serializable isolation level, but you must understand the concurrency model of your database and the implications of it, and realize that they are not all the same. Oracle, Postgres, MySQL, SQL Server are all different.


mastermedo • yesterday at 9:34 PM

> Surprisingly, there are many more stories and publications about bugs caused by weak isolation levels than cases where stronger isolation levels caused impractically low performance.

I expected the article to substantiate the claim that serializable brings a large performance hit, because in my experience it doesn't. The article basically makes the same point.

With serializable, you need to be a little careful not to have hot rows. Avoid them by sharding commonly written values. Another way to improve performance is to use TrueTime-style timestamps to order transactions that are not read-then-write. It's a little finicky if the database doesn't provide such guarantees out of the box. Take Google's Spanner as an example. It offers the serializable isolation level and it's pretty performant (as long as you account for hot spots).
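
To make "sharding commonly written values" concrete, here is a minimal in-memory sketch. `ShardedCounter` is a hypothetical name and each list entry only stands in for one database row; it assumes the hot value is something additive like a counter. Writers pick one shard at random, so two concurrent increments rarely touch the same row, and readers pay for it by summing all shards.

```python
import random


class ShardedCounter:
    """Spread one logical counter over several rows so writers rarely collide."""

    def __init__(self, num_shards=8, rng=None):
        self.shards = [0] * num_shards  # each entry stands in for one database row
        self.rng = rng or random.Random()

    def increment(self, amount=1):
        # A writer touches exactly one randomly chosen shard.
        idx = self.rng.randrange(len(self.shards))
        self.shards[idx] += amount
        return idx

    def total(self):
        # A reader has to sum every shard to get the logical value.
        return sum(self.shards)


counter = ShardedCounter(num_shards=4, rng=random.Random(0))
for _ in range(1000):
    counter.increment()
```

The trade-off is that reads now touch every shard, so this only helps for values that are written far more often than they are read.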


sanqui • yesterday at 10:19 PM

I only recently learned about serializable transactions and it seems bonkers that this is not the default. It makes a lot of sense combined with the event sourcing pattern. I believe it allows you to query for state in the decide function and then emit events safely without having to implement aggregates or versioning (aka you have "dynamic consistency boundaries"). The crucial part is that if any of the queried information changes before the event is emitted the transaction fails and business logic has to be retried until you get a conclusive answer.


zadikian • yesterday at 11:31 PM

Was curious about the Flexcoin hack, but the article wasn't loading, so here's an archive: https://web.archive.org/web/20240423000007/https://hackingdi... Supposedly it was this simple:

  mybalance = database.read("account-number")
  newbalance = mybalance - amount
  database.write("account-number", newbalance)
  dispense_cash(amount)   // or send bitcoins to customer

and MongoDB didn't even have a way to do this atomically? An RDBMS with read-committed would handle this fine if you did a "read for update" (SELECT ... FOR UPDATE) on that row.
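
A rough sketch of an atomic version of the pseudocode above, using Python's built-in `sqlite3` so it runs as-is. Note the swap: SQLite has no `SELECT ... FOR UPDATE`, so instead of locking the row and then writing, this folds the balance check and the debit into a single conditional `UPDATE`. The `accounts` table and the `withdraw` helper are assumptions for illustration, not anything from the Flexcoin code.

```python
import sqlite3


def withdraw(conn, account, amount):
    """Debit atomically; return True only if the balance covered the amount."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE accounts SET balance = balance - ? "
            "WHERE id = ? AND balance >= ?",
            (amount, account, amount),
        )
        # rowcount is 0 when the account is missing or the balance is too low.
        return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO accounts VALUES ('account-number', 100)")
conn.commit()

first = withdraw(conn, "account-number", 80)   # balance covers it
second = withdraw(conn, "account-number", 80)  # only 20 left, so this is refused
balance = conn.execute(
    "SELECT balance FROM accounts WHERE id = 'account-number'"
).fetchone()[0]
```

Cash should only be dispensed when `withdraw` returns True, and only after the debit has committed.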
