
abelanger yesterday at 3:52 PM

The "constraining functions to only be durable" idea is really interesting to me and would solve the main gotcha of the article.

It'd be an interesting experiment to take memory snapshots after each step in a workflow, which an API like Firecracker might support, though it would likely add even more overhead than current engines in terms of expense and storage. I think some durable execution engines have experimented with this type of system before, but I can't find a source now. Perhaps someone has a link to one of these.

There's also been some work, for example in the Temporal Python SDK, to override the asyncio event loop so that regular calls like `sleep` work as durable calls instead, which reduces the risk of developer mistakes. I'm not sure how well this generalizes.
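
To make the "durable `sleep`" idea concrete, here is a rough sketch of the underlying trick: persist the wake-up deadline so that a restarted process only waits for the remaining time. This is not how the Temporal SDK implements it (that happens inside a custom event loop); `durable_sleep`, `journal`, and `key` are hypothetical names, and a plain dict stands in for durable storage.

```python
import asyncio
import time
from typing import Dict


async def durable_sleep(journal: Dict[str, float], key: str, seconds: float) -> float:
    """Hypothetical helper: a sleep whose deadline survives a restart.

    `journal` stands in for durable storage (assumption: a real engine
    would persist this outside the process).
    """
    # Record the wake-up deadline only once; a replay reuses the stored value.
    deadline = journal.setdefault(key, time.time() + seconds)
    # After a restart, wait only for whatever time is left (possibly none).
    remaining = max(0.0, deadline - time.time())
    await asyncio.sleep(remaining)
    return remaining
```

Calling it explicitly keeps the durability visible; hiding the same logic behind a plain `asyncio.sleep` is the part that needs event-loop support.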


Replies

vouwfietsman yesterday at 5:40 PM

Ok, I'm not an expert here, you most likely are, but just my 2 cents on your response: I would very much argue against making this magic. For example:

> take memory snapshots after each step in a workflow

Don't do this. Just give people explicit boundaries for where their snapshots occur and what is snapshotted, so they have control over both durability and performance. Make it clear to people that everything should be in the chain of command of the snapshotting framework, e.g., no file-local or global variables. This is already how people program web services, but somehow nobody leans into it.
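
A minimal sketch of what explicit boundaries could look like, assuming a hypothetical `Workflow.step` helper (not any real engine's API). Only the return value of each named step is snapshotted, and all state flows through those return values rather than through globals. The `journal` dict stands in for durable storage, and the `calls` list exists only so the demo can show which steps actually ran.

```python
import json
from typing import Any, Callable, Dict, List


class Workflow:
    """Hypothetical explicit-checkpoint runner, for illustration only."""

    def __init__(self, journal: Dict[str, str]) -> None:
        # Stand-in for durable storage (assumption: a database or file in practice).
        self.journal = journal

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        # Explicit boundary: the JSON-serialized result of fn is the snapshot.
        if name in self.journal:
            return json.loads(self.journal[name])  # replay: skip the side effect
        result = fn()
        self.journal[name] = json.dumps(result)
        return result


calls: List[str] = []  # demo-only record of which side effects ran


def charge_card(amount: int) -> Dict[str, int]:
    calls.append("charge")
    return {"charged": amount}


def send_receipt(charge: Dict[str, int]) -> str:
    calls.append("receipt")
    return f"receipt for {charge['charged']}"


def checkout(wf: Workflow, amount: int) -> str:
    # State is passed explicitly from one step to the next.
    charge = wf.step("charge", lambda: charge_card(amount))
    return wf.step("receipt", lambda: send_receipt(charge))
```

If the process dies after the `charge` step, rerunning `checkout` with the same journal skips the charge and resumes at `receipt`. The cost is that the author has to name each step and keep its result serializable, which is exactly the control being argued for.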

The thing is, if you want people to understand durability but you also hide it from them, the framework will actually be much more complicated to understand and work with.

The real golden ticket, I think, is to make readable, intuitive abstractions around durability, not to hide it behind normal-looking code.

Please steal my startup.
