Not sure if this belongs here, but I just quickly wanted to share an experience I had with Python, Docker and no Nix.
We have a Python project where I work, which is an experiment control software (i.e. it talks to motors and devices). It depends on the project “pytango” for some of its inner workings. pytango is, as the name implies, a library that wraps the C++ library “tango”.
The problem arose when I wanted to create a CI job using GitLab CI, just to run the tests with pytest. I had the following job definition (simplified):
```yaml
tests:
  image: python:3.8
  script:
    # we're also using numpy in the project
    - pip install numpy pytango
    - pytest tests
```
As you can see, I’m using `python:3.8` as the Docker base image, so I have `python` (version 3.8) and `pip` readily at my disposal. Nice! However, when I ran the CI pipeline, it failed pretty quickly.
The culprit was `pytango`, which didn’t have any binary package (wheel) available, so pip decided to build it from source. That failed because of missing build dependencies, notably `boost-python`.
So what to do here? I’m using `python:3.8` and what that image contains should be opaque to me, ideally. But in order to build this dependency, I have to look inside. And I see a Debian version in the Docker tags for the image, so hopefully I can just assume I’m on a Debian distribution. I extend the script slightly:
```yaml
tests:
  image: python:3.8
  stage: test
  script:
    - apt-get update && apt-get install -y libboost-python-dev libtango-dev python3-numpy
    - pip install pytango
    - pytest tests
```
And I’m a bit furious here already, because the distribution’s `python3-numpy` is of course a different version from the one my project uses. But that could be fixed with a virtualenv, I suppose. Not the point here.
This pipeline fails again, because installing `libtango-dev` installs `tango`, which, in turn, interactively asks me for something during installation (I think where the tango server is running, or something). So I have to disable that by setting `DEBIAN_FRONTEND=noninteractive`.
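(For reference, a small sketch of how that could look in the job definition; the details are from memory, so they may differ slightly.)

```yaml
tests:
  image: python:3.8
  stage: test
  variables:
    # stop debconf from prompting interactively during apt-get install
    DEBIAN_FRONTEND: noninteractive
  script:
    - apt-get update && apt-get install -y libboost-python-dev libtango-dev python3-numpy
    - pip install pytango
    - pytest tests
```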
And then the build takes forever. And…fails! And that was the really difficult problem to solve. Because it didn’t fail compiling, it failed trying to link `boost_python`. Which made no sense to me.
So I dug around in the contents of `libboost-python-dev` and noticed something: the library was called `libboost-python39.so` (or something very similar, this is from memory).
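(In case you want to reproduce that kind of digging, something along these lines inside the job should do it; treat it as a hypothetical diagnostic step, not part of the actual pipeline.)

```yaml
  script:
    # list the files the -dev package (and whatever it pulled in) put on disk
    - dpkg -L libboost-python-dev
    # or look at Debian's multiarch library directory directly
    - ls /usr/lib/x86_64-linux-gnu/ | grep -i boost_python
```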
So it’s `libboost-python39.so`, but I’m using `python:3.8` as the base image. That didn’t really add up. Apparently the base image has a distribution Python version, but installs a different one on top of it. So, switching to the `python:3.9` base image worked (the whole CI pipeline worked then). Of course, had I insisted on using Python version 3.8 (maybe my project doesn’t build with 3.9 yet?), I would be at a loss.
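(So the job that finally went green looked roughly like this, reconstructed from memory.)

```yaml
tests:
  image: python:3.9
  stage: test
  variables:
    DEBIAN_FRONTEND: noninteractive
  script:
    - apt-get update && apt-get install -y libboost-python-dev libtango-dev python3-numpy
    - pip install pytango
    - pytest tests
```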
Conclusion: I think there are some lessons about the current state of software packaging in here:
- It’s normal for language package managers, when they discover that a binary package is missing, to try to build it from source rather than error out. This even works, sometimes! But it’s a very flawed approach, because the package manager cannot support every distribution out there, has no root rights to install dependencies, and it’s also duplicated work.
- Using Docker as a base image for CI only goes so far. Images don’t compose at all (and can’t, by design), and as soon as the most trivial use case isn’t covered, you have to open them up or build your own base image, which gets very non-trivial.
The whole issue would be completely avoided by having Nix packages for tango, libboost-python and pytango, of course, and the resulting build would be cached automatically as well, something I had to add explicitly to my `.gitlab-ci.yml` file. Nix would also not have any interactive prompts while installing.
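(Purely as a sketch of what that might look like, assuming pytango and its tango/boost dependencies were packaged in nixpkgs, which is exactly the part that’s missing here.)

```yaml
tests:
  image: nixos/nix
  stage: test
  script:
    # hypothetical: assumes a ps.pytango attribute existed in nixpkgs; numpy and pytest do exist
    - 'nix-shell -p "python3.withPackages (ps: [ ps.numpy ps.pytango ps.pytest ])" --run "pytest tests"'
```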
Anyhow, I hope this was a bit interesting to read, and maybe you can convince some people with such stories that Nix might be a good fit for their projects.