yokuze • last Saturday at 11:50 AM • 7 replies

> The technical fix was embarrassingly simple: stop pushing to main every ten minutes.

Wait, you push straight to main?

> We added a rule — batch related changes, avoid rapid-fire pushes. It's in our CLAUDE.md (the governance file that all our AI agents follow):

> Avoid rapid-fire pushes to main — 11 pushes in 2h caused overlapping Kamal deploys with concurrent SQLite access.

Wait, you let _Claude_ push your e-commerce code straight to main which immediately results in a production deploy?

Replies

chasil • today at 3:05 PM

This is the actual problem:

"Kamal runs blue-green deploys — it starts a new container, health-checks it, then stops the old one. During the switchover, both containers are running. Both mount ultrathink_storage. Both have the SQLite files open."

WAL mode requires shared access to System V IPC mapped memory. This is unlikely to work across containers.

In case anybody needs a refresher:

https://en.wikipedia.org/wiki/Shared_memory

https://en.wikipedia.org/wiki/CB_UNIX

https://www.ibm.com/docs/en/aix/7.1.0?topic=operations-syste...
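
For anyone who wants to see what WAL mode puts on disk, here is a minimal Python sketch, assuming a local filesystem and the standard-library `sqlite3` module. SQLite's WAL documentation describes the wal-index as a memory-mapped `-shm` file that sits next to the database, which is why it requires all processes using the database to be on the same host. The function name `wal_sidecar_files` and the file name `demo.sqlite3` are made up for illustration; this is not the blog author's setup.

```python
import os
import sqlite3
import tempfile


def wal_sidecar_files(db_dir):
    """Open a database in WAL mode and list the files SQLite creates.

    Hypothetical helper for illustration only.
    """
    db_path = os.path.join(db_dir, "demo.sqlite3")
    conn = sqlite3.connect(db_path)
    try:
        # The pragma returns the journal mode actually in effect.
        mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
        conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (1)")
        conn.commit()
        # List the directory while the connection is still open, so the
        # -wal and -shm sidecar files are still present.
        files = sorted(os.listdir(db_dir))
    finally:
        conn.close()
    return mode, files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        mode, files = wal_sidecar_files(d)
        print(mode, files)
```

Whether two containers mounting the same volume can safely share that `-shm` mapping depends on the volume driver and the filesystem underneath it, so this only shows which files are involved, not that the blue-green overlap is safe.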

crabmusket • last Saturday at 12:03 PM

Patient: doctor, my app loses data when I deploy twice during a 10 minute interval!

Doctor: simply do not do that

xnorswap • today at 2:47 PM

I'm fairly confident they let it write the blog post too.

bombcar • today at 2:46 PM

Hey, Apple still takes their store down during product launches!

littlestymaar • today at 6:12 PM

> Wait, you let _Claude_ push your e-commerce code straight to main which immediately results in a production deploy?

Yikes. Thank you, but I'm not going to read “Lessons learned” by someone this careless.

tensegrist • today at 2:43 PM

i hate to be so blunt but look around the site and then tell me you're surprised

burnt-resistor • today at 6:41 PM

I suspect they don't wear helmets or seatbelts either. Sigh. The "I'm so proud and ignorant of unnecessarily risky behaviors" meme is tiring.

The Meta dev model, where reviewed diffs merge into main (rebase style) after automated tests run, is pretty good.

Also, staging plus canary deploys, with gradual, exponential production rollout and rollback, help derisk changes too.

Finally, have real, tested backup and restore processes (not just replicated copies) and the ability to roll back.
