[OCI images] Is there something similar to "distroless" images built with Nix?

tripleathena · November 24, 2023, 2:29pm

So Google has built this very “involved” tooling (IMO) in order to create “distroless” OCI images: GoogleContainerTools/distroless (“Language focused docker images, minus the operating system”).

They contain very few packages; for example, their gcr.io/distroless/static-debian12 contains only 3 packages, which is very appealing for reducing the vulnerability surface.

This kind of image is supposed to have the bare minimum needed to run other binaries compiled with Go, C++, Rust, etc.

Now to my question: is there such a thing in the Nix world? With dockerTools, this looks deceptively trivial:

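{ pkgs ? import <nixpkgs> { } }:

pkgs.dockerTools.buildLayeredImage {
  name = "nixbase";
  tag = "latest";
  # "now" stamps the image with the build time, which breaks binary
  # reproducibility; drop this line if bit-identical images matter.
  created = "now";
  # Data-only packages: CA certificates, /etc/protocols and /etc/services,
  # and the time zone database.
  contents = [
    pkgs.cacert
    pkgs.iana-etc
    pkgs.tzdata
  ];
}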

With the Nix approach, you get reproducible and way simpler tooling to generate the OCI images.

What am I missing (what’s the catch)? Does anyone use such an approach in their production systems?

TLATER · November 24, 2023, 5:41pm

Big pet peeve of mine. Rant incoming:

tl;dr: Fat OCI images are a cultural problem, not a technical one.

“Distroless” has always struck me more as marketing for abolishing a bad practice than as something particularly novel or interesting. Any build tool worth its salt should be trivially capable of making images like this; it’s effectively just writing a little manifest and collecting some files into a tarball, +/- writing some additional text files for user management and whatnot. You can do this in a short shell script, and that particular script actually has a bunch of additional features to allow mimicking Dockerfiles without forcing you into using fat OCI images.
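To illustrate the “additional text files for user management” part in Nix terms, here is a minimal sketch. It assumes that `pkgs.dockerTools.fakeNss` from nixpkgs is acceptable as the source of `/etc/passwd` and `/etc/group` (it only defines `root` and `nobody`); the image name `nixbase-nonroot` is made up for the example.

```nix
{ pkgs ? import <nixpkgs> { } }:

pkgs.dockerTools.buildLayeredImage {
  name = "nixbase-nonroot"; # hypothetical name, pick your own
  tag = "latest";
  contents = [
    # Provides minimal /etc/passwd, /etc/group and /etc/nsswitch.conf
    # with only the `root` and `nobody` users.
    pkgs.dockerTools.fakeNss
    pkgs.cacert
  ];
  # Run as the unprivileged `nobody` user (65534 in fakeNss) instead of root.
  config.User = "65534:65534";
}
```

If you need other users or groups, you would have to write those files yourself, for example with `pkgs.writeTextDir`.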

I’ve gotten very annoyed in the past having to explain to people that projects I was part of already were “distroless” by their very nature, and that adopting Google’s base images would do nothing for the project (nor actually be possible, given it involved LXD).

In fact, I believe the images from the “distroless” repo you shared are less minimal than what you could achieve with a properly tweaked build system - yes, the end result is rather minimal, but an application that needs neither libssl nor a libc (incredibly rare, I know), still has some useless gunk even with the most minimal “distroless” image available. A fully custom build can end up without those dependencies.
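As a sketch of what “a fully custom build” could look like with nixpkgs, assume the application can be linked statically. `pkgs.pkgsStatic.hello` is only a stand-in for your own program, and the image name is hypothetical.

```nix
{ pkgs ? import <nixpkgs> { } }:

let
  # Stand-in application: GNU hello from the statically linked (musl)
  # package set, so it should not need a shared libc at runtime.
  app = pkgs.pkgsStatic.hello;
in
pkgs.dockerTools.buildLayeredImage {
  name = "static-hello"; # hypothetical name
  tag = "latest";
  # No `contents` at all: the image only receives the store paths that
  # `config` refers to, i.e. the runtime closure of the binary.
  config.Cmd = [ "${app}/bin/hello" ];
}
```

Whether the result really contains nothing but the binary depends on what the package references at runtime, so it is worth checking the closure rather than assuming it.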

The only reason people are writing fat docker containers in the first place is that few general-purpose, properly isolated build systems exist, few are adequately easy to use for “simple” use cases, and even fewer existed historically before container technology was as advanced as it is today. So the easiest way to get something relatively reproducible was to take your build env (which was practically always some full Linux distro because it ran baremetal on a server or VM up until this point), throw it into a docker container, add the language-specific build tool of your choice, and then let it build your software in the container runtime.

This means you need dev tooling in your runtime, and it’s relatively tricky to split it out again without having to untangle a big mess, especially since language-specific build tools will typically install things into FHS(-ish) dirs and now you need to guess which artifacts are actually part of your dependencies and software and which are just cruft. Also there’s a whole distro underneath it with its own idiosyncrasies that your software’s dependencies may rely on, and it has useful debugging tools anyway (like a basic shell). So you just ship the whole image and eat the disk space overhead, which is amortized by the layer concept anyway, so you don’t care too much (until you deploy hundreds of thousands of poorly-written images and are now officially wasting buckets of money, or try to deploy them into embedded systems - then you try Alpine for a while, and find that that is good enough for now, and suddenly it’s 5 years later and Google are proudly announcing “distroless”).

We’re more than a decade (Wikipedia says cgroups were first merged in 2007) into refinement of the technology around this, but nonetheless, if you had an attentive build engineer even in the early days of containers, you could have had a “distroless” image from day one with a little effort. Google didn’t invent some revolutionary concept, as some people seem to think; they just wrote down what should be second nature, and probably was for some organizations, but is not general practice, due to people’s use of inadequate tooling (language-specific build tools almost universally have a bootstrapping issue that makes this much harder to solve from the confines of said build tools), inexperience with said tooling and lackluster knowledge of what actually comprises a sufficient runtime.

So yes, nix can produce “distroless” images rather trivially, and this should not be surprising. You’ll find that the “distroless” Google repo is also really just a small handful of relatively simple Bazel instructions for building things, and not particularly complex in isolation.

There is a caveat, however; nix packages, as prepared by contributors to nixpkgs, as a rule care very little about their disk space footprint. As such, you can’t necessarily expect really simple images written on top of nixpkgs to actually be as minimal as they should be. You’ll find spurious debug symbols that shouldn’t be there, but are because of bugs in nixpkgs, and lots of technically optional dependencies everywhere.
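One way to see what an image would actually drag in is to ask Nix for the closure of its contents before building the image. A minimal sketch, assuming the same three packages as in the question:

```nix
{ pkgs ? import <nixpkgs> { } }:

# Builds a small directory describing the runtime closure of the given
# root paths. After `nix-build`, the file `result/store-paths` lists every
# store path that an image with these contents would have to include.
pkgs.closureInfo {
  rootPaths = [
    pkgs.cacert
    pkgs.iana-etc
    pkgs.tzdata
  ];
}
```

Swapping in the package you actually want to ship shows quickly whether it pulls in more than you expected; `nix path-info -rSh` on a built path gives a similar view with sizes.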

This is not an issue with nix, the build tool, but a cultural issue around nixpkgs packages (arguably not an “issue”, it’s likely desirable for nixpkgs’ most common use cases). If you do all the work yourself, you can theoretically achieve the same images with nix that you can achieve with Bazel, with roughly the same amount of effort (though IME nix is easier to use).

It might not be quite as simple as your example, though, especially for more complex packages. You’d probably need to do some upstream work to get best results if you want to piggyback off of nixpkgs, or some downstream work if you don’t.
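The “downstream work” usually means overriding packages to switch off optional features. A minimal sketch with curl as the example: the flags `gssSupport` and `scpSupport` are assumed to exist in your nixpkgs revision (override arguments change over time, so check the package definition), and the image name is hypothetical.

```nix
{ pkgs ? import <nixpkgs> { } }:

let
  # Rebuild curl without Kerberos/GSS and without SCP/SFTP support, which
  # removes the corresponding libraries from its runtime closure.
  # Assumption: both flags are accepted by curl in this nixpkgs revision.
  slimCurl = pkgs.curl.override {
    gssSupport = false;
    scpSupport = false;
  };
in
pkgs.dockerTools.buildLayeredImage {
  name = "slim-curl"; # hypothetical name
  tag = "latest";
  contents = [ pkgs.cacert ];
  config.Cmd = [ "${slimCurl}/bin/curl" "--version" ];
}
```

The cost is that an overridden package is no longer served from the binary cache, so you build it (and anything depending on it) yourself. Some packages already ship trimmed variants in nixpkgs, such as `curlMinimal` or `gitMinimal`, which avoid the override entirely.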

Google’s images, being designed specifically for this use case, contain software that is already tweaked to be quite minimal, as it originates from a different culture and design goal. As such, getting something good enough may be easier, but an “optimal” solution will not necessarily be any easier to achieve than with nix.

On the other hand, using Google’s images will leave you without a large repository of useful pre-built dependencies that you don’t have to build yourself. So for particularly complex, high-level packages, nix may again be easier to use, albeit at a disk space cost (which you can work towards minimizing upstream).

If we’re just talking about the number of dependencies though, since you’re talking about attack surfaces, I think both should be quite similar.

nixinator · November 24, 2023, 5:42pm

In the early days, containers were NOT full-blown operating systems, they were small ‘processes’… because guess what, they are ‘processes’…

Then some bright spark said, let’s turn a container into a full-fat VM with a full-fat OS on it… YAY!

WRONG.

So yes, you can do this, but do you want to use this type of container at all?

I’m not a container fan; they create more problems than they solve.

But hey, if you must succumb to the whale…

This cool asciinema recording was the thing that piqued my interest in nix, and I then quickly discovered that I didn’t need containers at all, because with the way nix builds things, they are ‘contained’ already, and I don’t have to run a container operating system on top of unix; I can use unix primitives to do everything I need.

Why run two operating systems, unix + docker tools, when running just UNIX can do the job?

However, if you’re trapped in docker container legacy systems, nix can build effective docker containers…

you might be interested in https://nixery.dev/

have fun!

nixinator · November 24, 2023, 5:50pm

I would like to thank the mighty @TLATER for a superior answer to this question.

It’s a common problem: technology comes along, it solves a problem… then… hey, we can do Y with it now… and Y is a very bad idea, but everyone jumps on board anyway because it’s the cool thing down the coffee shop.

The final straw was when I saw whole projects being ‘built’ in containers, not deployed in containers… this is the start of the dockerpocalypse… you don’t want to know what that is, I assure you.

Love or hate docker, it gets the job done… and nix can build it all, after all. Whatever suits you…

tripleathena · November 24, 2023, 9:12pm

Bravo! Love the rant, thanks for sharing it. I am still new to Nix, so I wasn’t aware that packages in nixpkgs aren’t as minimal as they could be.

tripleathena · November 24, 2023, 9:15pm

Yes, at $WORK we use Kubernetes. Nixery indeed looks interesting! A very creative approach with the paths and packages.

nixinator · November 25, 2023, 9:45am

Remember the H in Kubernetes stands for Happiness.

may the nix be with you… always.

igorramazanov · December 16, 2024, 1:24pm

An interesting area is combining Type-1 (bare-metal) hypervisors (see Xen) with unikernels (see Unikraft).

Then you strip out almost everything that’s not needed.

Would be nice to build unikernels with nix.