Why would someone use dockertools.buildimage over using a Dockerfile?

samlhuillier · November 5, 2022, 8:21pm

I can’t see an obvious reason… but I suppose there must be one.

NobbZ · November 5, 2022, 8:25pm

dockerTools is reproducible

raphi · November 6, 2022, 9:23am

And much easier to use… no need to specify dependencies, copy files manually, etc. because nix already knows all dependencies!

dockerTools.buildLayeredImage {
  name = "stuff";
  config.Entrypoint = [ "${yourPackage}/bin/stuff" ];
}
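For anyone who wants to try this, here is a slightly fuller sketch of the same idea. It assumes a channel-style `<nixpkgs>` and uses `pkgs.hello` as a stand-in for `yourPackage`; the image name and tag are made up for illustration.

```nix
# default.nix: minimal sketch, not a production setup
{ pkgs ? import <nixpkgs> { } }:

pkgs.dockerTools.buildLayeredImage {
  name = "hello-image"; # illustrative name
  tag = "latest";
  # Referencing the package in a string is enough: Nix adds pkgs.hello and
  # its runtime closure to the image, nothing has to be copied by hand.
  config.Entrypoint = [ "${pkgs.hello}/bin/hello" ];
}
```

Building this with `nix-build` should leave a `result` tarball that can be loaded with `docker load < result`.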

TLATER · November 6, 2022, 11:45am

Not only that, but it also generally produces smaller images. Since people typically use Dockerfiles to also build their software, download dependencies, and so on, you essentially need a full FHS system in there, enough to run wget, gcc, and the like.

This is incredibly inefficient, and is why all these images use FROM ubuntu, ultimately wasting some 500MB of space.

Even Google with their “distroless” do this, they just pull in busybox. Probably the worst definition of “distroless” ever, they basically just named a debian-based Linux distro “distroless”.

dockerTools, on the other hand, will put only exactly what you need to run your binaries in there, using Nix’s dependency inference to define what that actually means, instead of “probably a libc and busybox”.
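To see what “exactly what you need” means for a given package, one option is to ask Nix for the runtime closure directly. A small sketch, again using `pkgs.hello` as a placeholder package; the list it produces should be roughly what an image built around that package would contain.

```nix
# closure.nix: sketch for inspecting a runtime closure
{ pkgs ? import <nixpkgs> { } }:

# closureInfo computes the runtime closure of the given root paths and
# writes it out as text files, including a `store-paths` file with one
# store path per line.
pkgs.closureInfo { rootPaths = [ pkgs.hello ]; }
```

After `nix-build closure.nix`, `cat result/store-paths` lists every store path the package needs at runtime.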

You can instead use a Dockerfile with only an ADD instruction, of course, but this requires using an external build system like Nix to create the files to ADD. dockerTools just cuts out the middleman and makes Nix do the ADD.

NobbZ · November 6, 2022, 12:13pm

With multistage docker and statically linked binaries you can get damn close to real distroless, meaning an effective FROM scratch in the final stage.

This is indeed not easy to achieve and usually only seen for rust or golang binaries.

On the other hand, dockerTools is not perfect at “runtime dependency detection”, because Nix isn’t either.

Using the Erlang release system resulted in wastefully large closures: Erlang (which gets copied into the release!) was still detected as a runtime dependency, because a line in a generated file referenced the Erlang binary originally used to build the release.
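The mechanism behind this can be reproduced in a few lines. Nix finds runtime dependencies by scanning the build output for store path hashes, so any file that merely mentions a store path pulls that path into the closure. The sketch below is a toy stand-in (it uses `pkgs.hello` and a made-up file name, not the actual Erlang release):

```nix
# stray-reference.nix: toy reproduction of an accidental runtime dependency
{ pkgs ? import <nixpkgs> { } }:

pkgs.runCommand "stray-reference-demo" { } ''
  mkdir -p $out/share
  # This text file is never executed, but it contains the store path of
  # pkgs.hello. The reference scan finds the hash, so hello and all of its
  # own dependencies become part of this output's runtime closure.
  echo "built with ${pkgs.hello}/bin/hello" > $out/share/build-info.txt
''
```

After `nix-build stray-reference.nix`, `nix-store --query --requisites ./result` should list `hello` even though nothing in the output ever runs it.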

In addition to that, the default erlang uses some stuff that is not actually required in docker, like systemd integration for EPMD.

So you need to put some manual effort to reduce the runtime closure, which results in building erlang from source. I was able to place a PR to the related builder before it got integrated into nixpkgs.

Also, Erlang modules are built with debug info enabled by default and copied like that, again resulting in references to the closure they are copied from. Those have to be stripped manually. I was able to place a PR that at least gives us an option to strip those after the fact.

It took me some days to analyze this, and fix the issues in the long term.

Though all in all, due to the things mentioned above, there was an overall overhead of ~150 MiB in the runtime closure.

Regular Docker builds from a Dockerfile wouldn’t have had this problem, and an idiomatic multistage Dockerfile usually produced images of less than 50 MiB for simple applications.

Achieving the same with nixpkgs’ dockerTools has been a lot more involved and required knowledge in erlang builds, nixpkgs idioms, the erlang sub ecosystem in nixpkgs, and nix tooling to analyze the issues.

So dockerTools doesn’t magically produce smaller images. There might be a lot of effort required to actually get them on par with idiomatic Dockerfiles.

abathur · November 6, 2022, 3:00pm

I started using it indirectly (through Mach-nix) for a work project because it enabled us to re-use most of our nix-based dev environment to build a container for qc/staging. That’s gone well enough that I’m in the process of doing the same for the production environment (with just a teeny-tiny bit more nix work to leave out some dev dependencies).

jmgilman · November 6, 2022, 5:26pm

“Not only that, but it also generally produces smaller images.” (quoting TLATER)

For clarity, I’ve been building most of my Docker images with Nix (using nix2container in my case), but I’ll push back on this one a bit. It can produce smaller images, however, your mileage most certainly varies based on the quality of packages you’re pulling in. In one example I spent an entire day trying to fix packages that brought in MBs of wasted space in order to get my container image size down.

The problem is that most of nixpkgs is not optimized for space saving - so this is certainly an extra hurdle that doing Docker the nix way introduces.

abathur · November 6, 2022, 6:02pm

Indeed

I see this as part of a broader problem with communicating precisely about Nix to set the right expectations.

For example, it’s common-ish to see someone express consternation when they run into some sort of cross-platform sharp corner because they thought Nix/nixpkgs were supposed to fix problems like these. It can, and by this point many of them come for ~free, but a lot of them still need humans to run into the problem and figure out what to do instead of shrugging.

Nix/nixpkgs provide a toolkit for doing helpful things like specifying dependencies precisely, cross-compiling, using the same abstractions on multiple platforms, and so on–but packages need to get beaten into shape by people with those needs.

(That might be directly, updating/communicating packaging practices to dump more cruft, building tooling that tries to find packages that some general kinds of cleanup work well on, collecting a good focused guide of steps humans can take to tighten packages up, getting more obvious rules into linters, etc.?)

TLATER · November 7, 2022, 1:48am

Those are some very good caveats, thanks.

I still think making your integration system manage the packages in the image directly is better than going through the intermediate Dockerfile, even if the Docker ecosystem has good tools and prepared packages available to minimize the overhead these days. Using the integration system keeps dependency information intact, which at least in theory helps control the final outputs.

But yes, clearly nixpkgs still needs work before I can make bold claims like “it also generally produces smaller images”.

peterhoeg · November 7, 2022, 2:19am

Another reason is staying within the ecosystem. I have a pet project (built in Crystal but that doesn’t really matter), where I’m using nix to provide the dev environment as well as build the artifacts. Hydra handles the ongoing building and so on. Throwing in dockerTools in order to get containers is super easy and a lot less work than having to deal with a Dockerfile all of a sudden.

cmkarlsson · November 18, 2022, 3:25am

@NobbZ I just spent a day building a phoenix release using nix with the goal to build a minimal docker image.

I found the same problems you mention here with erlang and systemd being included leading to massive docker images.

Are you saying that even if you worked around these problems your image size is still 150 MiB?

Nix has a lot of promise, but working through the somewhat hard-to-read documentation to finally manage to build a Phoenix derivation and create a Docker image, only to see such a substandard result, was pretty disappointing.

I wish I had seen your post before I started rather than when I was frantically trying to trouble-shoot the problem

NobbZ · November 18, 2022, 6:59am

I was able to build one of my older experiments. It creates an image of ~60MiB gzipped.

The other one doesn’t build at all for some reason I have not yet dug into (the build stalls during dependency resolution).

And a third project is currently in the 9th minute of building Erlang, which usually takes 30 to 40 minutes on my machine. Though I think it will be in the same ballpark as the first regarding image size.

I thought I had more things lying around, but that is currently not the case. I will see if I can extract what I have learned to reduce an Elixir closure size and create an example application.

EDIT:

The last image built was 22MiB compressed!

cmkarlsson · November 18, 2022, 9:13am

I am playing around with this.

Erlang was included because of a reference in erts-*/bin/start (this is even mentioned in a comment in mix-release.nix, but nothing is done about it). I don’t think this script is used anywhere, so I deleted the file in the postInstall of my derivation, and that saved me 400 MB.
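Roughly, the workaround looks like the sketch below. Treat it as an outline under assumptions: `release` is a hypothetical argument standing for your own release derivation (for example one built with `mixRelease`), and it assumes that derivation runs the `postInstall` hook and puts the release at the top level of `$out`.

```nix
# slim-release.nix: sketch, `release` is a hypothetical argument
{ pkgs ? import <nixpkgs> { }, release }:

release.overrideAttrs (old: {
  # Keep any existing postInstall and append the cleanup step.
  postInstall = (old.postInstall or "") + ''
    # This script contains the store path of the Erlang used at build time.
    # Removing it drops that reference, assuming nothing else in the
    # output still mentions the same store path.
    rm -f $out/erts-*/bin/start
  '';
})
```

Whether the file is really unused depends on how the release is started, so it is worth checking that the container still boots afterwards.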

I then use unstable instead of 22.05 and the beam files get stripped. I think that might be the PR you were mentioning in the other post. That reduced it slightly more.

Next I am trying to remove systemd. I am trying to use overrideAttrs on the erlang package to set systemdSupport = false. I can set it to false, and nix is kind enough to recompile erlang for me (it takes longer than I remember), but the flag has not actually done anything. It seems like systemd is still included. I think my lack of knowledge of nix plays a part, and that I am not using overrideAttrs correctly, or am using it in the wrong place or something.
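A likely explanation for this symptom is the difference between `overrideAttrs` and `override`. The sketch below shows it on a toy package; `demo` and `featureSupport` are made-up names standing in for a real package expression and its flag, and whether the erlang expression in a given nixpkgs revision exposes its flag through `override` is something to check rather than assume.

```nix
# override-demo.nix: toy illustration of override vs overrideAttrs
{ pkgs ? import <nixpkgs> { } }:

let
  # Hypothetical package function with a feature flag as a function argument.
  demo = { runCommand, lib, featureSupport ? true }:
    runCommand "demo" { } ''
      echo "feature=${lib.boolToString featureSupport}" > $out
    '';

  pkg = pkgs.callPackage demo { };
in
{
  # overrideAttrs only changes attributes of the finished derivation. This
  # adds an unused variable named featureSupport: the hash changes, so Nix
  # rebuilds, but the build script was already generated with the flag on.
  viaOverrideAttrs = pkg.overrideAttrs (old: { featureSupport = false; });

  # override re-calls the package function with different arguments, so
  # the flag actually reaches the build script.
  viaOverride = pkg.override { featureSupport = false; };
}
```

Building both attributes with `nix-build override-demo.nix -A viaOverrideAttrs` and `-A viaOverride` and reading the results shows that only the second one has the feature turned off.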

NobbZ · November 18, 2022, 9:20am

The last version I have tried (22 MiB) uses nixos-20.09 as a base. I therefore do a lot of stuff there that is easier to do in 22.05 or unstable.

I really have to clean that up a bit and prepare a template or blog post or something.

I can not start that today and hope to find some time during the weekend. Though as my son has a scout event time is quite limited.

@moderators can you please split the discussion from Why would someone use dockertools.buildimage over using a Dockerfile? - #11 by cmkarlsson into a new topic? I think it’s derailing (in a positive manner).

cmkarlsson · November 18, 2022, 9:21am

No need to rush anything for my sake. Family first. I will take this as an opportunity to learn more about nix derivations and see if I can sort it out.

peterhoeg · November 18, 2022, 9:35am

Another thing to try out may be to fiddle with something other than an erlang application as the nix/erlang combination does seem to produce a less than ideal output unless you massage it.

NobbZ · November 20, 2022, 4:42pm

As the change over a naïve approach was just a single line, and there was not much left to do after all the changes were merged, I just dropped a rough template showing the result in a gist:

This can be used as a template for 22 MiB base size compressed images.

The size of the compressed image was already at ~60 MiB when I used beam. So the situation is a lot better than it was before.

IIRC the size of the naïve approach was about 200 MiB compressed, so more than 3 times bigger than now…

cmkarlsson · November 22, 2022, 2:31am

Thanks!

I will try this out