A little anecdote about a non-nixified dystopia

Not sure if this belongs here, but I just quickly wanted to share an experience I had with Python, Docker and no Nix.

We have a Python project where I work, which is experiment control software (i.e. it talks to motors and devices). It depends on the project “pytango” for some of its inner workings. pytango is a library that wraps the C++ library “tango”, as the name implies.

The problem arose when I wanted to create a CI job using GitLab CI, just to run the tests with pytest. I had the following job definition (simplified):

tests:
  image: python:3.8
  script:
    # we’re also using numpy in the project
    - pip install numpy pytango
    - pytest tests

As you can see, I’m using python:3.8 as the Docker base image, so I have python (version 3.8) and pip readily at my disposal. Nice! However, when I ran the CI pipeline, it failed pretty quickly.

The culprit was pytango, which didn’t have any binary package available and then decided to build itself. It failed because it was missing build dependencies, notably boost-python.

So what to do here? I’m using python:3.8 and what that image contains should be opaque to me, ideally. But in order to build this dependency, I have to look inside. And I see a Debian version in the Docker tags for the image, so hopefully I can just assume I’m on a Debian distribution. I extend the script slightly:

tests:
  image: python:3.8
  stage: test
  script:
    - apt-get update && apt-get install -y libboost-python-dev libtango-dev python3-numpy
    - pip install pytango
    - pytest tests

And I’m a bit furious here already, because the distribution’s python3-numpy is of course a different version than the one my project uses. But that could be fixed with a virtualenv, I suppose. Not the point here.

This pipeline fails again, because installing libtango-dev installs tango, which, in turn, interactively asks me for something (I think where the tango server is running, or something). So I have to disable that by setting DEBIAN_FRONTEND=noninteractive.
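For reference, one way to set that variable is a job-level variables block, which GitLab exports into the environment of every script line. This is only a sketch built on the job above, not necessarily how the original pipeline did it:

```yaml
tests:
  image: python:3.8
  stage: test
  variables:
    # Ask debconf not to prompt during apt-get install
    DEBIAN_FRONTEND: noninteractive
  script:
    - apt-get update && apt-get install -y libboost-python-dev libtango-dev python3-numpy
    - pip install pytango
    - pytest tests
```

Packages then fall back to their default answers, so this only helps if the defaults are acceptable for a throwaway CI container.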

And then the build takes forever. And…fails! And that was the really difficult problem to solve. Because it didn’t fail compiling, it failed trying to link boost_python. Which made no sense to me.

So I dug around in the contents of libboost-python-dev and noticed something: the library was called libboost-python39.so (or slightly similar, this is from memory).

So it’s libboost-python39.so, but I’m using python:3.8 as the base image. That didn’t really add up. Apparently the underlying Debian has its own distribution Python version (3.9, which is what libboost-python-dev is built for), and the image installs a different one (3.8) on top of it. So, switching to the python:3.9 base image worked (the whole CI pipeline worked then). Of course, had I insisted on using Python version 3.8 (maybe my project doesn’t build with 3.9 yet?), I would have been at a loss.
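A mismatch like this can be spotted before the long build with a throwaway diagnostic job. The following is a rough sketch; the job name is made up, and the exact library file names and paths are assumptions that depend on the Debian release behind the image:

```yaml
debug-boost-python:
  image: python:3.8
  stage: test
  variables:
    DEBIAN_FRONTEND: noninteractive
  script:
    # Python provided by the image (the one pip and pytest use)
    - python --version
    - apt-get update && apt-get install -y libboost-python-dev
    # Distribution Python, if the install pulled one in
    - /usr/bin/python3 --version || true
    # The version suffix in the file name shows which Python boost_python was built for
    - find /usr/lib -name 'libboost_python*'
```

If the suffix on the library does not match the output of the first command, linking pytango against it is unlikely to work.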

Conclusion: I think there are some lessons about the current state of software packaging in here:

  1. It’s normal for language package managers to try to build a package when they discover that no binary is available, rather than to error out. This even works, sometimes! But it’s a very flawed approach, because the package manager cannot support every distribution out there and has no root rights to install dependencies, and it’s also duplicated work.
  2. Using Docker images as a base for CI only goes so far. They don’t compose at all (and can’t, by design), and as soon as your use case isn’t covered, however trivial it is, you have to open the image up or build your own base image, which gets non-trivial quickly.

The whole issue would have been avoided completely by having Nix packages for tango, libboost-python and pytango, of course, and the resulting build would be cached automatically, which is something I had to add explicitly to my .gitlab-ci.yml file. Nix would also not show any interactive prompts while installing.
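To make that concrete, a Nix-based version of the job could look roughly like this. It is a sketch under a big assumption: a hypothetical shell.nix in the repository root that provides Python, numpy, pytango and pytest, which in turn requires those Nix packages to exist:

```yaml
tests:
  # Official Nix image instead of python:3.8
  image: nixos/nix
  stage: test
  script:
    # Enter the environment described by the (hypothetical) shell.nix and run the tests in it
    - nix-shell --run "pytest tests"
```

The Python version and all native libraries would then be pinned in one place, instead of being split between the image tag, apt and pip.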

Anyhow, hope this was a bit interesting to read and maybe you can convince some people using such stories that Nix might be a good fit for their projects.

You can forgo Docker in favor of a nix shell in GitHub CI, even when not using it elsewhere in the project: https://github.com/marketplace/actions/nix-shell. It may help resolve dependency messes like this in the future.

In fact, I think that will even run inside a docker container if you do provide one :slight_smile:

Yes that’s a good idea, and I already set up a gitlab-runner instance and a machine with Nix on it.

However, I’m hesitant to go through with it. Not because I’m not convinced it would solve the problems, but because I don’t want to introduce this new element without educating people about it. Not sure how I will solve this issue though.

just do it, nobody will notice :wink: and then upgrade to https://hercules-ci.com/ and cachix.

A long long time ago I once upgraded all our print servers to Linux, as Windows would crash every 4 hours under the load.

The management specifically said ‘no linux, and no open source software on the network’.

I secretly replaced it anyway, and nobody knew…it worked without a reboot for years.

I later found out that our Microsoft software rep had told management that if we used anything but Windows on the network, we would lose all our Microsoft discounts.

That’s the day I found out it’s not the best solutions that win out…it’s about something else, I’ll let you guess what that is.

I think that’s why that action is so nice: it’s so straightforward that using Nix is just an implementation detail. “I’m using an action to download our dependencies in CI” is a much easier sell than “let’s start using Nix to build our projects”.

But I appreciate that it’s not always easy, too :wink: Wish I could convince more of my coworkers to go down the nix route, at least the ones I work with directly.

when you can’t convince your company that things are a good idea, that’s how startups are made.

I just wanted to take off my hat :grin:
