The future of Python build systems and Gentoo

https://blogs.gentoo.org/mgorny/2021/11/07/the-future-of-python-build-systems-and-gentoo/

I’m curious to what extent we’ll be affected by these problems.

Why do programming language ecosystems (npm, Python) care more about creating their own package managers than about using the ones that already exist?

Because language-specific package managers are simpler to get right (of course, Python failed spectacularly there, but successful examples abound).

2 Likes

This doesn’t convince me at all. Package management is a mechanism; getting it right should have little or nothing to do with the language ecosystem being packaged. I originally came to Nix in search of a solution that obviates those island solutions.

I can’t substantiate it, but I suspect that historically the existing package managers were so hard to port to new use cases, and their source so hard to read, that rolling your own seemed easier than using or adapting one.

The mechanisms by which each language ecosystem functions are highly specific, sometimes involve hard-coded paths, and often depend on language syntax or details of the specific implementation. I think it’s difficult, if not impossible, to write a real generic implementation of this.

Traditionally, there has also been little cross-platform support in any given package manager (nix is quite rare in this, and even it requires a POSIX environment), so as a language maintainer you either roll your own or accept that you won’t have support for a very large portion of your potential userbase. Over time, languages that follow the POSIX-only path get crowded out.

Those things together mean that you’re going to have a language-specific manager that knows everything about how your language resolves libraries, and works together with the underlying system as little as necessary to keep things working among all major platforms, and some minor ones a business or two is interested in.

Distro package managers, which try to be generic, usually focus on making C work first, since that’s necessary to get a basic system running. They then add hacky support for various languages that do their own thing once it is realized that not all packages listen to LD* and pkg-config, even if they use dependencies that do. Nix is hardly an exception in this. It’s natural that there is a clash between these ideologies, and there is no easy way to resolve it.

I just hope distros remain necessary and we don’t get to a world where each one is basically a language + its package manager. See xkcd: Operating Systems.

4 Likes

Looks like NixOS already has support for the new package type (pyproject), although we currently default to setuptools.

In mk-python-derivation.nix:

# Several package formats are supported.
# "setuptools" : Install a common setuptools/distutils based package. This builds a wheel.
# "wheel" : Install from a pre-compiled wheel.
# "flit" : Install a flit package. This builds a wheel.
# "pyproject": Install a package using a ``pyproject.toml`` file (PEP517). This builds a wheel.
# "egg": Install a package from an egg.
# "other" : Provide your own buildPhase and installPhase.
, format ? "setuptools"

What could be annoying is if projects that previously shipped data files as part of the Python package start installing or downloading those files at runtime. Those packages would become less reproducible.

The more annoying part is being able to bootstrap Python packages. Even with just poetry-core, there are still some ~50 Python package dependencies:

$ nix-store --query --requisites /nix/store/xlfbcsggn4klxgjq77xy742cp2h3jwkx-python3.9-poetry-core-1.0.7.drv | grep python3 | wc -l
60

If any of those decides to require poetry-core, then we have a circular dependency and will have to start pinning old versions just for bootstrapping.
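The bootstrapping concern can be sketched as a cycle check over build-time dependencies. This is only an illustration: the graph below is made up (some-lib is a hypothetical package, and the edges are not the real poetry-core closure), and it uses the standard-library graphlib module, which needs Python 3.9 or newer.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical graph: package -> packages needed to build it.
build_deps = {
    "python3": set(),
    "poetry-core": {"python3"},
    "some-lib": {"python3", "poetry-core"},
}


def bootstrap_order(graph):
    """Return a build order with dependencies first; raises CycleError if none exists."""
    return list(TopologicalSorter(graph).static_order())


order = bootstrap_order(build_deps)

# Assumption for illustration: poetry-core starts requiring some-lib to build.
cyclic_deps = {**build_deps, "poetry-core": {"python3", "some-lib"}}
try:
    bootstrap_order(cyclic_deps)
    has_cycle = False
    cycle = []
except CycleError as err:
    has_cycle = True
    cycle = err.args[1]  # nodes forming the cycle, first node repeated at the end
```

Once such a cycle appears, the only way out is to break an edge by hand, which in practice means building one package from an older pinned version.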

I really think the better approach would have been: “creating and installing an individual wheel is the interpreter’s responsibility, but frontends can do their own <file> -> wheel and dependency resolution workflows”.

Python’s “delegate the responsibility of implementation to the community” approach just makes the ecosystem more fractured and broken.

3 Likes

Also, a similar, related post: Python: Please stop screwing over Linux distros

1 Like

It’s quite easy to rant about Python packaging, but it has been improving a lot. Some of these changes are relatively large, however, and not always understood by distro packagers. Additionally, some features distro packagers would like are not considered, but judging from the discussions on packaging PEPs, that’s mostly because these packagers are often just not very involved. I think this is the key part: distro packagers tend to be more familiar with multiple packaging systems and their details, cross-compilation for example, than the people typically involved in Python packaging design. Those people are very open, though, so IMO the issue is mostly a lack of communication.

About the items touched upon:

  1. Deprecating distutils is big. It affects many packages that build extension modules, and setuptools especially. The transition did not go smoothly at first, but that’s often the case with setuptools, and the maintainer acted quickly on issues. The way forward is to use sysconfig directly or, even better, a more modern build backend such as mesonpep517, which is a tiny interface to meson.
  2. Not having toml in the stdlib is indeed unfortunate now that pyproject.toml depends on it. That, along with having multiple backends, also makes bootstrapping a pain.
  3. Having these new backends is great. There is now e.g. flit, a very lightweight declarative backend for packaging pure Python packages. Then there is mesonpep517, which looks almost the same as flit but drives meson. This is far better than setuptools needing to know how to drive compilers, and it is also the reason scipy is now converting to meson as its build backend. Furthermore, because it became possible to write new backends and thereby experiment, there are now some additional PEPs that further standardize things that were found to work well and are typically shared among backends.
  4. Packing into a zip and then unpacking again is indeed silly, but so be it; the cost is completely insignificant.
  5. Deprecating setup.py install makes a lot of sense. There are many other backends now, and they are all driven in the same way.
  6. data_files support. Yes, this is one we noticed many years back when switching to wheels. It is unfortunate that it hasn’t been picked up since, but again, those who find it important can propose a PEP on how to deal with it.
  7. In the end, the author made a lot of custom changes to make the packaging work on Gentoo, and now it won’t work any more. Too bad for them.
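
Items 3 and 5 both rest on every backend being driven through the same PEP 517 hooks. Below is a minimal sketch of that idea, assuming a made-up in-process backend named toy_backend (not a real package) that writes an empty file instead of a real wheel. A real frontend would read the backend name from the [build-system] table of pyproject.toml and run the hook in a separate, isolated build environment.

```python
import importlib
import os
import sys
import tempfile
import types


def load_backend(build_backend):
    """Resolve a PEP 517 build-backend string: "module" or "module:object"."""
    module_name, _, object_path = build_backend.partition(":")
    backend = importlib.import_module(module_name)
    for attr in filter(None, object_path.split(".")):
        backend = getattr(backend, attr)
    return backend


def build_wheel_with(build_backend, wheel_directory):
    """Call the mandatory build_wheel hook; no backend-specific code is needed."""
    backend = load_backend(build_backend)
    return backend.build_wheel(
        wheel_directory, config_settings=None, metadata_directory=None
    )


# Hypothetical backend, for illustration only.
toy_backend = types.ModuleType("toy_backend")


def _toy_build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    name = "toy-0.1-py3-none-any.whl"
    open(os.path.join(wheel_directory, name), "wb").close()  # placeholder, not a valid wheel
    return name  # the hook returns the basename of the file it wrote


toy_backend.build_wheel = _toy_build_wheel
sys.modules["toy_backend"] = toy_backend

with tempfile.TemporaryDirectory() as out_dir:
    wheel_name = build_wheel_with("toy_backend", out_dir)
    wheel_exists = os.path.exists(os.path.join(out_dir, wheel_name))
```

Swapping in flit, mesonpep517, or setuptools would only change the string passed to build_wheel_with, which is the point of having one interface.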
2 Likes

It was my plan to switch to pyproject over a year ago, but I did not get to it. We really should, and at the same time get rid of format and let users add the relevant hooks (that is, build backends) themselves. format was something I wrote before PEP 517 existed.

Why should they, and which should they pick?

I think that package managers doing language dependency management is an artifact of the time when most programs were written in C, and C didn’t have dependency management. Only the latest generation of system package managers (Flatpak, Nix, …) even supports having multiple versions of a dependency installed. Even that is still not enough: development needs per-project software installs, custom dependency resolution, and much more. Unless somebody wants to implement the dependency resolution algorithm of every programming language ever used, language-specific package managers will always be a thing. We should steer away from that “ideal” and stop frowning upon language-specific package managers.

Package management requires dependency management, and this is highly language-specific. Traditional package managers have adopted C-style dependency handling and more or less coerced all other packages into the same mold. A package manager will always be built upon some model of how things should work internally, and will need a variable amount of hacking to make all of the packages fit.

So instead of a judgmental and antagonizing “why are they doing this?” (I think the Gentoo folks are especially guilty of this), let’s rather ask “how can these systems work together in peace?” At the moment there is no clear answer, but as far as I can tell it will involve lock files and handing the library dependency resolution off to the language package managers. The alternative would be to re-implement all features of all language package managers (in Nix), which I wouldn’t consider feasible or worthwhile.
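
One way to picture the lock file approach: the language package manager resolves versions and records hashes, and the system side only checks what it fetched against those pins. The sketch below assumes a made-up JSON lock format (the field names and example-lib are hypothetical, not any real tool's schema).

```python
import hashlib
import json


def pinned_entries(lock_text):
    """Index lock entries by package name; no version resolution happens here."""
    lock = json.loads(lock_text)
    return {pkg["name"]: pkg for pkg in lock["packages"]}


def verify_artifact(entry, data):
    """Check fetched bytes against the sha256 pinned in a lock entry."""
    return hashlib.sha256(data).hexdigest() == entry["sha256"]


# Stand-in for an artifact the language package manager already resolved.
artifact = b"pretend this is example_lib-1.2.3-py3-none-any.whl"
lock_text = json.dumps(
    {
        "packages": [
            {
                "name": "example-lib",
                "version": "1.2.3",
                "sha256": hashlib.sha256(artifact).hexdigest(),
            }
        ]
    }
)

entries = pinned_entries(lock_text)
ok = verify_artifact(entries["example-lib"], artifact)
tampered = verify_artifact(entries["example-lib"], artifact + b"x")
```

The system package manager never needs to know how the versions were chosen, only that the bytes match the pins.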

4 Likes