Hacker News

eperot today at 6:30 PM

gVisor is open-source, and `cuda-checkpoint` is freely available.

gVisor's `runsc checkpoint` subcommand supports a `--save-restore-exec-argv` flag, which lets you specify a program to execute before gVisor starts taking the process snapshot.

You can fill in the blanks from there.
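A rough sketch of how those pieces might fit together, written as Python that only builds the command lines and does not run anything. Assumptions, none of which come from the comment above: the hook path `/usr/local/bin/gpu-toggle-hook` is hypothetical, the idea that the hook would call `cuda-checkpoint --toggle --pid <PID>` on the CUDA process is a guess at the intended wiring, and passing the hook to `--save-restore-exec-argv` as a single shell-quoted string is also a guess, so check `runsc checkpoint --help` for the actual value format.

```python
import shlex


def cuda_toggle_cmd(pid):
    """argv that asks cuda-checkpoint to toggle the CUDA state of one process."""
    return ["cuda-checkpoint", "--toggle", "--pid", str(pid)]


def runsc_checkpoint_cmd(container_id, image_path, hook_argv):
    """argv for `runsc checkpoint` with a program to run before the snapshot.

    ASSUMPTION: the hook argv is passed as one shell-quoted string.
    The real value format of --save-restore-exec-argv may differ.
    """
    return [
        "runsc",
        "checkpoint",
        f"--image-path={image_path}",
        f"--save-restore-exec-argv={shlex.join(hook_argv)}",
        container_id,
    ]


if __name__ == "__main__":
    # Hypothetical hook script; it would be expected to run something like
    # cuda_toggle_cmd(<pid of the CUDA process>) before the snapshot is taken.
    hook = ["/usr/local/bin/gpu-toggle-hook"]
    print(shlex.join(runsc_checkpoint_cmd("my-container", "/tmp/ckpt", hook)))
    print(shlex.join(cuda_toggle_cmd(1234)))
```

The container ID, image path, and PID in the demo are placeholders. How the hook finds the PID of the CUDA process is left open here, since the thread does not say.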


Replies

za_mike157 today at 6:33 PM

We and the team from Modal have been upstreaming things to the gVisor repo (https://github.com/google/gvisor/pulls) to make it compatible with cuda-checkpoint and other parts of our system. While we are both contributing fixes and performance improvements, we are unfortunately keeping some secret sauce to ourselves, but hopefully what is upstream should get most folks to a successful implementation as is.