[OCI images] Is there something similar to "distroless" images built with Nix?

TLATER · November 24, 2023, 5:41pm

Big pet peeve of mine. Rant incoming:

tl;dr: Fat OCI images are a cultural problem, not a technical one.

“Distroless” has always struck me more as marketing for abolishing a bad practice than as something particularly novel or interesting. Any build tool worth its salt should be trivially capable of making images like this; it’s effectively just writing a little manifest and collecting some files into an archive (a tarball, really), plus or minus some additional text files for user management and whatnot. You can do this in a short shell script, and that particular script actually has a bunch of additional features to allow mimicking Dockerfiles without forcing you into using fat OCI images.
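To make the "little manifest plus some collected files" point concrete, here is a rough sketch in plain Python (standard library only) that writes a single-layer image in the OCI image-layout format. The function names, the hard-coded `amd64`/`linux` platform, and the placeholder binary contents are all assumptions made for illustration; this is not how Google's "distroless" repo or nixpkgs' tooling does it, just a demonstration of how little is involved.

```python
import hashlib
import io
import json
import tarfile
import tempfile
from pathlib import Path
from typing import Dict, List


def build_layer(files: Dict[str, bytes]) -> bytes:
    """Pack the given files into an uncompressed tar: the whole root filesystem."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, content in sorted(files.items()):
            info = tarfile.TarInfo(name)
            info.size = len(content)
            info.mode = 0o755
            info.mtime = 0  # fixed timestamp so the layer digest is reproducible
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def write_blob(root: Path, data: bytes, media_type: str) -> dict:
    """Store data under blobs/sha256/<digest> and return its OCI descriptor."""
    digest = hashlib.sha256(data).hexdigest()
    blob_dir = root / "blobs" / "sha256"
    blob_dir.mkdir(parents=True, exist_ok=True)
    (blob_dir / digest).write_bytes(data)
    return {"mediaType": media_type, "digest": "sha256:" + digest, "size": len(data)}


def build_image(root: Path, files: Dict[str, bytes], entrypoint: List[str]) -> dict:
    """Write a one-layer OCI image layout into root; return the manifest descriptor."""
    layer_desc = write_blob(
        root, build_layer(files), "application/vnd.oci.image.layer.v1.tar"
    )
    config = {
        "architecture": "amd64",  # assumption: hard-coded platform
        "os": "linux",
        "config": {"Entrypoint": entrypoint},
        # The layer is uncompressed, so its diff_id equals its blob digest.
        "rootfs": {"type": "layers", "diff_ids": [layer_desc["digest"]]},
    }
    config_desc = write_blob(
        root, json.dumps(config).encode(), "application/vnd.oci.image.config.v1+json"
    )
    manifest = {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "config": config_desc,
        "layers": [layer_desc],
    }
    manifest_desc = write_blob(
        root, json.dumps(manifest).encode(), "application/vnd.oci.image.manifest.v1+json"
    )
    index = {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.index.v1+json",
        "manifests": [manifest_desc],
    }
    (root / "index.json").write_text(json.dumps(index))
    (root / "oci-layout").write_text(json.dumps({"imageLayoutVersion": "1.0.0"}))
    return manifest_desc


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Placeholder bytes; in practice this would be a statically linked executable.
        desc = build_image(Path(tmp), {"bin/app": b"placeholder"}, ["/bin/app"])
        print(desc["digest"])
```

The resulting directory contains nothing but the application file, two small JSON documents, and an index pointing at them. A real image would carry an actual binary plus whatever files it needs at runtime (certificates, `/etc/passwd`, and so on), and would typically compress its layers, but the structure stays the same.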

I’ve gotten very annoyed in the past having to explain to people that projects I was part of already were “distroless” by their very nature, and that adopting Google’s base images would do nothing for the project (nor actually be possible, given it involved LXD).

In fact, I believe the images from the “distroless” repo you shared are less minimal than what you could achieve with a properly tweaked build system. Yes, the end result is rather minimal, but an application that needs neither libssl nor a libc (incredibly rare, I know) still ships with some useless gunk even on the most minimal “distroless” image available. A fully custom build can end up without those dependencies.

The only reason people are writing fat docker containers in the first place is that few general-purpose, properly isolated build systems exist, few are adequately easy to use for “simple” use cases, and even fewer existed historically before container technology was as advanced as it is today. So the easiest way to get something relatively reproducible was to take your build env (which was practically always some full Linux distro because it ran baremetal on a server or VM up until this point), throw it into a docker container, add the language-specific build tool of your choice, and then let it build your software in the container runtime.

This means you need dev tooling in your runtime, and it’s relatively tricky to split it out again without having to untangle a big mess, especially since language-specific build tools will typically install things into FHS(-ish) dirs and now you need to guess which artifacts are actually part of your dependencies and software and which are just cruft. Also, there’s a whole distro underneath it with its own idiosyncrasies that your software’s dependencies may rely on, and it has useful debugging tools anyway (like a basic shell). So you just ship the whole image and eat the disk space overhead, which is amortized by the layer concept anyway, so you don’t care too much (until you deploy hundreds of thousands of poorly-written images and are now officially wasting buckets of money, or try to deploy them into embedded systems; then you try Alpine for a while, and find that that is good enough for now, and suddenly it’s 5 years later and Google are proudly announcing “distroless”).

We’re well over a decade (Wikipedia says cgroups were first merged in 2007) into refinement of the technology around this, but nonetheless, if you had an attentive build engineer even in the early days of containers, you could have had a “distroless” image from day one with a little effort. Google didn’t invent some revolutionary concept, as some people seem to think; they just wrote down what should be second nature, and probably was for some organizations, but is not general practice. That comes down to people’s use of inadequate tooling (language-specific build tools almost universally have a bootstrapping issue that makes this much harder to solve from the confines of said build tools), inexperience with said tooling, and lackluster knowledge of what actually comprises a sufficient runtime.

So yes, nix can produce “distroless” images rather trivially, and this should not be surprising. You’ll find that the “distroless” Google repo is also really just a small handful of relatively simple Bazel instructions for building things, and not particularly complex in isolation.

There is a caveat, however; nix packages, as prepared by contributors to nixpkgs, as a rule care very little about their disk space footprint. As such, you can’t necessarily expect really simple images written on top of nixpkgs to actually be as minimal as they should be. You’ll find spurious debug symbols that shouldn’t be there, but are because of bugs in nixpkgs, and lots of technically optional dependencies everywhere.

This is not an issue with nix, the build tool, but a cultural issue around nixpkgs packages (arguably not an “issue”, it’s likely desirable for nixpkgs’ most common use cases). If you do all the work yourself, you can theoretically achieve the same images with nix that you can achieve with Bazel, with roughly the same amount of effort (though IME nix is easier to use).

It might not be quite as simple as your example, though, especially for more complex packages. You’d probably need to do some upstream work to get best results if you want to piggyback off of nixpkgs, or some downstream work if you don’t.

Google’s images, being designed specifically for this use case, contain software that is already tweaked to be quite minimal, as it originates from a different culture and design goal. As such, getting something good enough may be easier, but an “optimal” solution will not necessarily be any easier to achieve than with nix.

On the other hand, using Google’s images will leave you without a large repository of useful pre-built dependencies that you don’t have to build yourself. So for particularly complex, high-level packages, nix may again be easier to use, albeit at a disk space cost (which you can work towards minimizing upstream).

If we’re just talking about the number of dependencies though, since you’re talking about attack surfaces, I think both should be quite similar.