dockerTools image sizes are absurd. How to improve?

I’ve been attempting to replace my Dockerfile-based containers with ones built from Nix’s dockerTools. However, I am quite disappointed by the sizes of said images, and clueless as to how to improve them.

Goal:
Put one or two binaries from nixpkgs (e.g. nodejs) into a container using Nix dockerTools functions.
Issues:

  • These binaries depend on libc. In Nix, glibc alone is 31 MB, and with its other deps a whopping 34 MB gets added to any binary that depends on libc (see for yourself: `nix-store -qR $(nix eval --raw nixpkgs#glibc.outPath) | xargs -I{} du -sh {}`). Yikes. Contrast this with Alpine (uses musl libc) or Google’s “distroless” (uses glibc), whose base images are ~5 MB and ~3-4 MB respectively.
  • While the contents of a dockerTools image with just nodejs added do align precisely with the deps of nodejs according to dive (`nix-store -qR $(nix eval --raw nixpkgs#nodejs.outPath)`), some of nodejs’s deps seem totally absurd, like pulling in ~8 MB of just header files from libuv-dev, zlib-dev, openssl-dev, and icu4c-dev. The latter also depends on coreutils for some reason (something to do with the install command, according to `nix why-depends`??), which pulls in another 1.5 MB that isn’t used whatsoever! (It’s a very similar situation with the other binaries I’ve tried so far, like git.)

Combined, these added dependencies make an enormous size difference:

```
$ docker image ls
REPOSITORY        TAG        SIZE
test-nix          latest     169MB
test-dockerfile   latest     63.7MB
```

So, my questions are:

  • What, if anything, can I do to reduce the size of images generated via Nix dockerTools?
  • Is this an unavoidable side effect of Nix’s runtime dependency detection in general, or is there room for improvement in how these binaries (nodejs, git) are packaged in nixpkgs?

Other things I have considered:

  • Using pkgsMusl or pkgsStatic. Might reduce size, but requires compiling the entire universe, which I’m not gonna do.
  • Ignoring the added size, and hoping that glibc stuff becomes a shared common layer in all my containers due to buildLayeredImage’s layering algorithm? I’m not sure how to avoid sending duplicate layers between servers though.
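Regarding the layering idea: a minimal sketch of what that could look like with `dockerTools.buildLayeredImage`, which puts frequently-shared store paths (like glibc) into their own layers so identical layers are deduplicated between images and only transferred once per registry. The image name here is illustrative:

```nix
{ dockerTools, nodejs, ... }:
dockerTools.buildLayeredImage {
  name = "test-nix-layered";
  # maxLayers caps how many store paths get their own layer before the
  # remainder are lumped into a single final layer
  maxLayers = 100;
  contents = [ nodejs ];
  config.Cmd = [ "/bin/node" ];
}
```

Whether duplicate layers actually avoid re-transfer depends on the registry and push tooling deduplicating by layer digest, which most OCI registries do.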

For clarity’s sake, here’s what I used to generate the test-nix and test-dockerfile images with nodejs:

```nix
{ dockerTools, buildEnv, nodejs, ... }:
dockerTools.buildImage {
  name = "test-nix";
  copyToRoot = buildEnv {
    name = "image-root";
    paths = [ nodejs ];
    pathsToLink = "/bin";
  };
  config.Cmd = [ "/bin/node" "something" ]; # nodejs installs bin/node, not bin/nodejs
}
```
and the Dockerfile:

```dockerfile
FROM alpine:latest
RUN apk add --no-cache nodejs
```

Some other posts/resources I’ve seen (still no clear solutions though):

You can build your stuff with musl, it might just be extra work. I know some people who also build fully static executables to minimise image size. Also extra work.
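A minimal sketch of what the musl route could look like, reusing the expression from the original post — note that `pkgsMusl.nodejs` and most of its dependencies will likely not be in the binary cache, so this rebuilds a lot from source:

```nix
{ dockerTools, pkgsMusl, ... }:
dockerTools.buildImage {
  name = "test-nix-musl";
  copyToRoot = pkgsMusl.buildEnv {
    name = "image-root";
    paths = [ pkgsMusl.nodejs ]; # built against musl instead of glibc
    pathsToLink = "/bin";
  };
  config.Cmd = [ "/bin/node" ];
}
```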

Re node, use nodejs-slim if you want less clutter.
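If npm isn’t needed, swapping in nodejs-slim is a one-line change to the expression from the original post (sketch; image name is illustrative):

```nix
{ dockerTools, buildEnv, nodejs-slim, ... }:
dockerTools.buildImage {
  name = "test-nix-slim";
  copyToRoot = buildEnv {
    name = "image-root";
    paths = [ nodejs-slim ]; # node without the bundled npm/npx
    pathsToLink = "/bin";
  };
  config.Cmd = [ "/bin/node" "something" ];
}
```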


Non-dev outputs definitely should not depend on dev, unless it is some kind of developer tool or something.

Fixing that in Nixpkgs would be preferable but it can be hard sometimes (see e.g. the PostgreSQL split).


I saw that but in this case I need npm as well and nodejs-slim strips that out :P (I’m also now realizing alpine separates node and npm into separate packages, so `apk add nodejs npm` would’ve been slightly fairer)

In general: reduce the closure size of what you stuff into the container.

This usually requires some deeper overrides or an overlay.

Nixpkgs is optimized for convenience first, only then for closure size.

To achieve minimal closure sizes, you will have to rebuild the world.

Perhaps you find some acceptable middle ground.

In my Erlang journey, a reduction from over 200 MiB to a little less than 100 MiB was already that acceptable middle ground. The additional 30 to 40 MiB I shaved off much later were just the cherry on top.

It’s not really convenient, but this is something you can explicitly (manually) do with nix2container. See https://blog.eigenvalue.net/2023-nix2container-everything-once/.


I’m maybe repeating after @NobbZ, but I’d phrase this as: split outputs and remove references to what’s not used at runtime. If you’re lucky, it’ll be something you can contribute to Nixpkgs, because overlays are more expensive in terms of maintenance and rebuilds. As mentioned above, keeping things working takes priority in Nixpkgs over closure sizes, which might feel demotivating, but note that the tools for reducing closure and image sizes (the concepts of outputs and references, overlays and overrides, the various buildLayeredImage implementations, etc.) are all in place. These tools let you keep doing things the “immutable” way, which may be preferable if you plan to work on a project for a long time or in a larger team.
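As a concrete (hypothetical) sketch of the “remove references” approach: the exact references that need stripping vary per package, so treat the `openssl.dev` target and the `$out/bin/node` path here as illustrative:

```nix
{ nodejs, openssl, removeReferencesTo }:
nodejs.overrideAttrs (old: {
  nativeBuildInputs = (old.nativeBuildInputs or [ ]) ++ [ removeReferencesTo ];
  postFixup = (old.postFixup or "") + ''
    # zero out the store-path reference to the dev output embedded in the binary
    remove-references-to -t ${openssl.dev} $out/bin/node
  '';
  # make the build fail if the dev output sneaks back into the runtime closure
  disallowedReferences = [ openssl.dev ];
})
```

`remove-references-to` overwrites the hash part of a store path inside a file, so Nix’s scanner no longer detects it as a runtime dependency; `disallowedReferences` then acts as a regression test.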

As another example that these tools work: pytorch with cuda used to have a closure of about 20 GB a couple of years back, but today, IIRC, it’s up to 1.5-3x smaller uncompressed than the respective NGC images (I might be comparing uncompressed to compressed right now; need to check).

P.S. I see that half of glibc.out’s size is share/locales, but I’m very skeptical that this would be worth the bother to split out.

P.P.S.

```
❯ sudo podman pull nvcr.io/nvidia/pytorch:23.12-py3
❯ sudo podman image ls
REPOSITORY               TAG                       IMAGE ID      CREATED        SIZE
nvcr.io/nvidia/pytorch   23.12-py3                 1cff6923bda0  7 months ago   22.2 GB
...
❯ sudo podman image save nvcr.io/nvidia/pytorch:23.12-py3 -o pytorch.tar
❯ gzip pytorch.tar
❯ du -hs pytorch.tar.gz
9.9G    pytorch.tar.gz
```