charles, amit, can you go into more detail about the path-based caching? Particularly "shared bytes aren’t guaranteed to be in the exact same container image layer"? I've built something that solves issues around sharing data between layers, and am interested to see if it fits use cases like Modal's.
Edit: "The solution is to disaggregate the container launcher (runc for Docker, runsc for gVisor) from the container image delivery" is exactly what I've done! I've not built a lazy FUSE on top of it (yet! except for cache mounts in BuildKit), but it's on my TODO list. I guess I'm mainly curious what stops bytes from being shared in your case.
To clarify: we do content-based hashing, and when we say "shared bytes aren’t guaranteed to be in the exact same container image layer", what we mean is that
FROM some/image
RUN pip install torch==2.7.1
and
FROM another/image
RUN pip install torch==2.7.1
will produce images with very high overlap in contents, which will be shared by a content-based cache, but those images' final layers are disjoint from the perspective of a layerwise cache.
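To make the distinction concrete, here is a toy sketch. The file names and contents are invented, and `layer_digest` is a simplified stand-in (real layer digests are computed over the layer tarball), so treat it as an illustration of the idea rather than how any particular cache is implemented. It assumes the two final layers differ by one extra dependency file because the base images differ.

```python
import hashlib
from typing import Dict


def file_digest(data: bytes) -> str:
    """Content hash of a single file: the key a content-based cache would use."""
    return hashlib.sha256(data).hexdigest()


def layer_digest(layer: Dict[str, bytes]) -> str:
    """Simplified stand-in for a layer digest: hashes every path and its content."""
    h = hashlib.sha256()
    for path in sorted(layer):
        h.update(path.encode())
        h.update(b"\0")
        h.update(file_digest(layer[path]).encode())
        h.update(b"\n")
    return h.hexdigest()


# Hypothetical files written by `pip install torch==2.7.1` in both images.
torch_files = {
    "site-packages/torch/__init__.py": b"torch package contents",
    "site-packages/torch/lib/libtorch.so": b"large shared library bytes",
}

# Assumption: some/image lacks a dependency that another/image already ships,
# so the same RUN step produces a slightly different layer diff.
final_layer_a = {**torch_files, "site-packages/typing_extensions.py": b"extra dependency"}
final_layer_b = dict(torch_files)

# Layerwise view: the two final layers are different objects, nothing is shared.
layers_match = layer_digest(final_layer_a) == layer_digest(final_layer_b)

# Content-based view: compare per-file hashes instead of whole layers.
hashes_a = {file_digest(data) for data in final_layer_a.values()}
hashes_b = {file_digest(data) for data in final_layer_b.values()}
shared = hashes_a & hashes_b

print("final layers identical:", layers_match)
print("files shared by content:", len(shared), "of", len(hashes_a | hashes_b))
```

A single differing file is enough to change the whole layer's digest, while the per-file hashes for the torch files still match across both images.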