Hacker News

fourteenminutes · today at 5:25 AM

As a (very happy) RWX customer:

- Intermediate tasks are cached in a Docker-like manner (content-addressed by filesystem and environment). Tasks in a CI pipeline build on previous ones by layering the filesystems of their dependencies (AFAIU via overlayfs), so the same task is never executed twice. The most prominent example: a feature branch that is up to date with main passes CI on main the moment it's merged, because every task on main is a cache hit from the CI run on the feature branch. (Roughly sketched in code after this list.)

- Failures: the UI surfaces failures to the top, and because of the caching semantics, you can re-run just the failed tasks without having to re-run their dependencies.

- Debugging: they expose a breakpoint command (https://www.rwx.com/docs/rwx/remote-debugging) that pauses execution mid-task and lets you shell into the remote container, so you can debug interactively instead of repeatedly pushing commits that add `env` dumps and other debugging steps. And when you do need to push to test a fix, the caching semantics again mean you skip all the setup.
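
To make the caching model concrete, here's a minimal sketch of content-addressed task caching; this is plain Python illustrating the general idea, not RWX's actual implementation, and all the names here are hypothetical. The key point: a task's cache key is derived from its command, its environment, and the content-addressed outputs of its dependencies, so an unchanged task resolves to the same key and its cached output is reused instead of re-executed.

```python
import hashlib
import json

# In-memory stand-in for a remote content-addressed cache of task outputs.
CACHE: dict[str, dict] = {}

def cache_key(command: str, env: dict[str, str], dep_keys: list[str]) -> str:
    """Content-address a task by everything that can affect its output:
    the command it runs, its environment, and the (already content-addressed)
    outputs of the tasks it depends on."""
    payload = json.dumps(
        {"command": command, "env": env, "deps": sorted(dep_keys)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

def run_task(command: str, env: dict[str, str], dep_keys: list[str]) -> str:
    """Return the cache key for the task's output, executing only on a miss."""
    key = cache_key(command, env, dep_keys)
    if key in CACHE:
        print(f"cache hit  {command!r} -> {key[:12]}")
        return key
    print(f"cache miss {command!r} -> executing")
    # A real system would execute in an isolated filesystem (e.g. overlayfs
    # layered on the dependencies' outputs) and store the resulting layer.
    CACHE[key] = {"output": f"<filesystem layer produced by {command}>"}
    return key

# A tiny pipeline: install deps, then build, then test.
deps = run_task("npm ci", {"NODE_ENV": "test"}, [])
build = run_task("npm run build", {"NODE_ENV": "test"}, [deps])
test = run_task("npm test", {"NODE_ENV": "test"}, [deps, build])

# Re-running the identical pipeline (e.g. on main right after merging the
# branch) hits the cache for every task, which is why CI passes instantly.
run_task("npm test", {"NODE_ENV": "test"}, [deps, build])
```

Re-running only a failed task works the same way: its dependencies' keys are unchanged, so their outputs come straight from the cache.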

There's a whole lot of other stuff: you can generate the tasks in a CI pipeline with any programming language of your choice, the concurrency control supports multiple modes, and there's no need for `actions/cache` because of the caching semantics and the incremental caching feature (https://www.rwx.com/docs/rwx/tool-caches).
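
To illustrate the "generate tasks in any language" point: a script can emit the task graph as data for the CI system to execute. The sketch below is hypothetical Python producing a generic JSON task list; the actual RWX run-definition format differs, so treat the field names as assumptions.

```python
import json
from pathlib import Path

# Hypothetical example: one test task per package in a monorepo, all
# depending on a shared setup task. Generic JSON, not RWX's real schema.
pkg_dir = Path("packages")
packages = sorted(p.name for p in pkg_dir.iterdir() if p.is_dir()) if pkg_dir.is_dir() else []

tasks = [{"key": "setup", "run": "npm ci"}]
tasks += [
    {
        "key": f"test-{pkg}",
        "run": f"npm test --workspace packages/{pkg}",
        "depends_on": ["setup"],
    }
    for pkg in packages
]

print(json.dumps({"tasks": tasks}, indent=2))
```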

And I've never had a problem with the logs.


Replies

ses1984 · today at 1:16 PM

The previous post describes a problem where you do a large Docker build, then fan out to many jobs that need to pull this image, and the overhead is enormous. This implies RWX has less overhead, but just saying there's a content-addressed cache doesn't explain how this particular problem is solved.

If you have a dockerfile where you make a small change in your source results in one particular very large layer that has to be built, then you want to fan out and run many parallel tests using that image, what actually happens when you try to run that new fat layer on a bunch of compute, and how is it better than the implied naive solution? That fat layer exists on a storage system somewhere, and a bunch of computer nodes need to read it, what happens?