A faster dockerTools.buildImage prototype

> Correct me if I'm wrong, but each time you load an image with streamLayeredImage, you need to recompute the digest of all layers.

This is correct, and it is a solution to the non-reproducible hash problem. So the three possible solutions are:
a. Just fail when the image paths aren't reproducible.
b. Compute the digest at the last moment. The representation is compact. The image is always loadable. Slight unreproducibilities can creep in when a different build of any path is used.
c. Compute the digest in advance and store the tar. The representation is inefficient. The image is always loadable. Unreproducibilities can creep in, but only when the image representation (json) is built more than once.

streamLayeredImage is (b). Your prototype is (a) and you’ve suggested (c) as opt-in.
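The difference between (b) and (c) can be sketched in a few lines. This is a minimal model, not the nixpkgs implementation: the layer bytes, function names, and storage shape are all hypothetical stand-ins.

```python
import hashlib

def digest(blob: bytes) -> str:
    """Docker-style layer digest: sha256 over the layer's tar bytes."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()

# Hypothetical layers: each one is just the bytes of its tar archive.
layers = [b"layer-1 tar bytes", b"layer-2 tar bytes"]

def digests_lazy(layers):
    """Option (b): recompute digests at load time; only the recipe is stored."""
    return [digest(t) for t in layers]

def digests_stored(layers):
    """Option (c): hash once at build time and keep the tars next to the
    digests; later loads read the stored values instead of re-hashing."""
    return {digest(t): t for t in layers}  # digest -> tar bytes, kept on disk

# Both options agree as long as the same build of each path is used;
# (b) pays the hashing cost on every load, (c) pays it once in storage.
assert digests_lazy(layers) == list(digests_stored(layers).keys())
```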

> so when Skopeo pushes an image, it can immediately skip already pushed layers

This is nice when using a remote DOCKER_HOST.
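The skip is possible because registries are content-addressed: a client can ask whether a blob with a given digest already exists (in the Docker Registry HTTP API v2, a `HEAD /v2/<name>/blobs/<digest>` request) and only upload on a miss. A toy model, with an in-memory set standing in for the registry:

```python
import hashlib

def digest(blob: bytes) -> str:
    return "sha256:" + hashlib.sha256(blob).hexdigest()

# Hypothetical in-memory "registry": the set of digests it already holds.
pushed: set = set()

def push_layer(blob: bytes) -> bool:
    """Upload a layer unless the registry already has it.
    Returns True if bytes were actually transferred."""
    d = digest(blob)
    if d in pushed:   # registry already has this digest -> skip the upload
        return False
    pushed.add(d)     # otherwise transfer the blob
    return True

base = b"shared base layer"
assert push_layer(base) is True    # first image uploads the layer
assert push_layer(base) is False   # second image reusing it skips the transfer
```

With precomputed digests, the client can make this check before streaming any layer bytes at all, which is what makes pushes over a slow link fast.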

> if you develop with containers, containers need to be rebuilt/reloaded as fast as possible (cc Arion :wink: ).

It’s hard to beat using the host store for development. You only have to load the “customization layer”, i.e. the non-store paths.

> In terms of user experience, I'm not sure streaming a tar on stdout is really convenient.

I have had no issues with this, but that could be because Arion tracks the metadata for me.
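For reference, the usual pattern for consuming the stream never materializes the tar on disk; the build result of streamLayeredImage is a script that writes the image to stdout. The file name `image.nix` here is a hypothetical placeholder:

```shell
# Assuming ./image.nix evaluates to a dockerTools.streamLayeredImage
# derivation: nix-build prints the path of the streaming script, the
# shell runs it, and its stdout is piped straight into `docker load`.
$(nix-build --no-out-link image.nix) | docker load
```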

> Also, I would like to reduce the amount of container-specific custom code in nixpkgs: Skopeo has all the tooling to push/load images; it would be nice to reuse this instead of rewriting it.

Absolutely.

> you could unify these representations and refactor streamLayeredImage to expose the json representation

… or you could do it the other way around: rebase streamLayeredImage onto your code. That would be a small wrapper, and you’d get a pretty good test suite for free, plus upcoming features like using NixOS modules to create images.
Maintaining dockerTools has been 80% “please add a test” :wink:
