Not sure if this belongs here, but I just quickly wanted to share an experience I had with Python, Docker and no Nix.
We have a Python project where I work, which is an experiment controls software (i.e. it talks to motors and devices). It depends on the project “pytango” for some of its inner workings. As the name implies, pytango is a library that wraps the C++ library “tango”.
The problem arose when I wanted to create a CI job using GitLab CI, just to run the tests with pytest. I had the following job definition (simplified):
```yaml
tests:
  image: python:3.8
  script:
    # we’re also using numpy in the project
    - pip install numpy pytango
    - pytest tests
```
As you can see, I’m using python:3.8 as the Docker base image, so I have python (version 3.8) and pip readily at my disposal. Nice! However, when I ran the CI pipeline, it failed pretty quickly.
The culprit was pytango, which didn’t have a binary package (wheel) available, so pip fell back to building it from source. That build failed because build dependencies were missing, notably boost-python.
So what to do here? I’m using python:3.8 and what that image contains should be opaque to me, ideally. But in order to build this dependency, I have to look inside. And I see a Debian version in the Docker tags for the image, so hopefully I can just assume I’m on a Debian distribution. I extend the script slightly:
```yaml
tests:
  image: python:3.8
  stage: test
  script:
    - apt-get update && apt-get install -y libboost-python-dev libtango-dev python3-numpy
    - pip install pytango
    - pytest tests
```
And I’m a bit furious here already, because the distribution’s python3-numpy is of course a different version than the one in my project. But that could be fixed with a virtualenv, I suppose. Not the point here.
This pipeline fails again, because installing libtango-dev pulls in tango, which, in turn, interactively asks me for something during installation (I think for where the tango server is running, or something like that). So I have to disable that by setting DEBIAN_FRONTEND=noninteractive.
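Concretely, that looks something like this (a sketch from memory; exporting the variable as the first script line would work just as well):

```yaml
tests:
  image: python:3.8
  stage: test
  variables:
    # keep debconf from prompting interactively (e.g. asking for the tango host) during apt-get install
    DEBIAN_FRONTEND: "noninteractive"
  script:
    - apt-get update && apt-get install -y libboost-python-dev libtango-dev python3-numpy
    - pip install pytango
    - pytest tests
```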
And then the build takes forever. And…fails! That was the really difficult problem to solve, because it didn’t fail while compiling, it failed while trying to link against boost_python. Which made no sense to me.
So I dug around in the contents of libboost-python-dev and noticed something: the library was called libboost-python39.so (or something slightly similar; this is from memory).
So it’s libboost-python39.so, but I’m using python:3.8 as the base image. That didn’t add up. Apparently the image’s underlying Debian ships its own distribution Python, which libboost-python-dev is built against, while the image installs a different Python (3.8) on top of it. So, switching to the python:3.9 base image made the whole CI pipeline work. Of course, had I insisted on using Python 3.8 (maybe my project doesn’t build with 3.9 yet?), I would have been at a loss.
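For completeness, here is roughly what the finally-green job looked like (again from memory, the only real change being the base image):

```yaml
tests:
  # python:3.9 matches the Python version that libboost-python-dev links against on this Debian
  image: python:3.9
  stage: test
  variables:
    DEBIAN_FRONTEND: "noninteractive"
  script:
    - apt-get update && apt-get install -y libboost-python-dev libtango-dev python3-numpy
    - pip install pytango
    - pytest tests
```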
Conclusion: I think there are some lessons about the current state of software packaging in here:
- It’s normal for language package managers, rather than erroring out when they discover that a binary package is missing, to try to build it from source. This even works, sometimes! But it’s a very flawed approach, because the package manager can’t support every distribution out there, has no root privileges to install dependencies, and it duplicates work the distributions already do.
- Using Docker images as a base for CI only goes so far. Images don’t compose at all (and can’t, by design), and as soon as the most trivial use case isn’t covered, you have to open them up or build your own base image, which quickly becomes non-trivial.
The whole issue would be completely avoided by having Nix packages for tango, libboost-python and pytango, of course, and the resulting build would be cached automatically as well, something I had to configure explicitly in my .gitlab-ci.yml file. Nix also wouldn’t show any interactive prompts while installing.
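To illustrate, a Nix-based job could look roughly like this. This is only a sketch: it assumes a shell.nix in the repository that provides numpy, pytango and pytest, which in turn assumes those packages (and tango) exist in nixpkgs or an overlay:

```yaml
tests:
  # the official Nix Docker image; no apt-get and no distribution Python to fight with
  image: nixos/nix
  stage: test
  script:
    # hypothetical: shell.nix pins numpy, pytango and pytest; builds come from the Nix store
    # (or a binary cache), so no hand-rolled CI caching is needed
    - nix-shell --run "pytest tests"
```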
Anyhow, I hope this was a bit interesting to read, and maybe stories like this can help convince some people that Nix might be a good fit for their projects.