Why is it so hard to use a Python package?

All I want to do is try out a Python package, spacy-wordnet. But it’s not in nixpkgs, so here we go on an epic journey.

The readme says to run: pip install spacy-wordnet. But that apparently doesn’t work on NixOS.

And pypi2nix is broken. So I turn to poetry2nix. I create a new Poetry project with poetry init, then poetry add spacy-wordnet. Then I add a default.nix that looks like this:

{ pkgs ? import <nixpkgs> {} }:
let
  myAppEnv = pkgs.poetry2nix.mkPoetryEnv {
    projectDir = ./.;
    editablePackageSources = {
      my-app = ./src;
    };
  };
in myAppEnv.env

per instructions in poetry2nix. That fails with pip._internal.exceptions.InstallationSubprocessError: Command errored out with exit status 1

So I try to get together a nix expression for it. There’s no resolver, apparently, so I have to declare each of the individual dependencies, and write build instructions for each of the dependencies, too.

So now I’m not only packaging spacy-wordnet, I’m packaging pyscaffold, and pyscaffold’s dependency configupdater. And probably more.

with import <nixpkgs> {};

( let
  spacyWordnet = pkgs.python3Packages.buildPythonPackage rec {
      pname = "spacy-wordnet";
      version = "0.0.5";

      src = pkgs.python3Packages.fetchPypi {
        inherit version;
        inherit pname;
        sha256 = "bErMjM0VIsQHnPFoEe80fPClpJyiCyNBuapegMgcPbc=";
      };

      propagatedBuildInputs = with pkgs.python3Packages; [ nltk spacy pyscaffold ];
      doCheck = false;
    };

  pyscaffold = pkgs.python3Packages.buildPythonPackage rec {
      pname = "PyScaffold";
      version = "4.2.1";

      src = pkgs.python3Packages.fetchPypi {
        inherit version;
        inherit pname;
        sha256 = "yM+pmDUD8xswH0sL7AqPQVGPmjo4U0cc6U+fncdwo+I=";
      };

      propagatedBuildInputs = with pkgs.python3Packages; [ setuptools-scm configupdater ];
      doCheck = false;
    };

  configupdater = pkgs.python3Packages.buildPythonPackage rec {
      pname = "configupdater";
      version = "3.1";

      src = pkgs.python3Packages.fetchPypi {
        inherit version;
        pname = "ConfigUpdater";
        sha256 = "3cxSUPUIuRMcRf0dvOrj8RKQfd11l9oc/zDFG/fIfts=";
      };

      propagatedBuildInputs = with pkgs.python3Packages; [ ];
      doCheck = false;
    };
in pkgs.python3.buildEnv.override rec {
    extraLibs = with pkgs.python3Packages; [
      spacy
      spacyWordnet
      pandas
      # spacy_models.en_core_web_lg
      scikitlearn
      nltk
      altair
      numpy
    ];
  }).env

This doesn’t work because it says that configupdater is missing for pyscaffold. But it’s clearly there.

And so then I notice that I’m many hours into this packaging problem, and all I want to do is just try out one stupid python package. Why is this so hard to do on NixOS? Am I just supposed to use Docker for everything?

What do Python developers do on NixOS? Does everyone just do development in an Ubuntu VM? I’m at my wits’ end here.


It isn’t really the “nix way”, but if you need a trapdoor you can use virtualenv:

$ nix-shell -p python3 python3Packages.virtualenv

$ virtualenv venv
created virtual environment CPython3.9.6.final.0-64 in 155ms
  creator CPython3Posix(dest=/home/abathur/work/blah/venv, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/abathur/.local/share/virtualenv)
    added seed packages: pip==21.2.4, setuptools==58.1.0, wheel==0.37.0
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator

$ source ./venv/bin/activate

$ pip install spacy-wordnet
Collecting spacy-wordnet
  Using cached spacy_wordnet-0.0.5-py2.py3-none-any.whl (650 kB)
Collecting nltk<3.4,>=3.3
  Using cached nltk-3.3-py3-none-any.whl
Requirement already satisfied: six in /nix/store/7lbsbjcifm349igq7xqcrg51s4gibnq5-python3.9-six-1.16.0/lib/python3.9/site-packages (from nltk<3.4,>=3.3->spacy-wordnet) (1.16.0)
Installing collected packages: nltk, spacy-wordnet
Successfully installed nltk-3.3 spacy-wordnet-0.0.5
WARNING: You are using pip version 21.2.4; however, version 22.1 is available.
You should consider upgrading via the '/home/abathur/work/blah/venv/bin/python -m pip install --upgrade pip' command.

$ python -c 'import spacy_wordnet; print(spacy_wordnet)'
<module 'spacy_wordnet' from '/home/abathur/work/blah/venv/lib/python3.9/site-packages/spacy_wordnet/__init__.py'>

I’m not sure if the people most-involved with Python packaging (maybe @FRidh @jonringer @adisbladis @DavHau?) agree, but my impression is that it’s because Python package management is largely an imperative mess, and it’s hard to square an imperative mess with Nix.

Because they’re hard to square, people have tried again and again to build good toolchains for automating this. Now there are multiple generations of toolchains for doing so, some of which are no longer maintained, and newer users generally won’t have perspective on where to start.

(This is not novel to the Python ecosystem; there’s been more than one attempt at building a toolchain for most of the language-specific package management ecosystems.)


Possibly what you’re looking for is this: nixpkgs/python.section.md at 49829a9adedc4d2c1581cc9a4294ecdbff32d993 · NixOS/nixpkgs · GitHub

With that small amount of scaffolding (creating a default.nix with venvShellHook and such) you’re mostly off to the races and can mostly just do Python things as you’re used to. The exception is with non-Python dependencies. You may need to add those to buildInputs and sometimes even mess with LD_LIBRARY_PATH.
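For reference, here is a minimal sketch of that scaffolding, modeled on the venvShellHook example in the nixpkgs Python documentation. The venvDir location, the requirements.txt file, and the zlib input are assumptions for illustration; adapt them to your project.

```nix
# shell.nix: an impure development shell backed by a regular virtualenv.
{ pkgs ? import <nixpkgs> { } }:

pkgs.mkShell {
  # Where venvShellHook creates (or reuses) the virtualenv.
  venvDir = "./.venv";

  buildInputs = [
    pkgs.python3Packages.python
    # Creates the venv on first entry and activates it on every entry.
    pkgs.python3Packages.venvShellHook
    # Non-Python dependencies go here as well (zlib is only a placeholder).
    pkgs.zlib
  ];

  # Runs once, right after the virtualenv has been created.
  # Assumes a requirements.txt next to this file.
  postVenvCreation = ''
    unset SOURCE_DATE_EPOCH
    pip install -r requirements.txt
  '';

  # Runs on every shell entry, after the venv has been activated.
  postShellHook = ''
    unset SOURCE_DATE_EPOCH
  '';
}
```

Running nix-shell in the project directory should then drop you into the activated virtualenv, where pip install behaves as it does elsewhere.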


Give mach-nix (GitHub: DavHau/mach-nix, “Create highly reproducible python environments”) a go. Unlike poetry2nix & co, it uses a full database to map out pip packages, and IME it is pretty good at just making things work without you having to package them.
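A rough sketch of what that could look like, following the pattern in the mach-nix README. The pinned tag (3.5.0) is an assumption; use whichever release you trust. I have not verified that this resolves spacy-wordnet's dependency constraints.

```nix
# shell.nix: let mach-nix resolve the requirements and build a Python shell.
let
  # Assumption: the 3.5.0 tag of mach-nix; pin a different ref if needed.
  mach-nix = import (builtins.fetchGit {
    url = "https://github.com/DavHau/mach-nix";
    ref = "refs/tags/3.5.0";
  }) { };
in
mach-nix.mkPythonShell {
  # Same syntax as a requirements.txt file.
  requirements = ''
    spacy-wordnet
  '';
}
```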

That said:

This is legitimately a last resort that I have taken twice now: once for Bazel, once for MapTool, both Java applications. Some build tools are simply awful and cannot be made to behave in any way, and as a result you’ll struggle to use them on NixOS or on any other OS that doesn’t follow Ubuntu’s rootfs structure.

It’s a shame, but unless you believe there should be no OSes besides Ubuntu, it’s a them bug. NixOS is in the same position as any minority userbase. Just as getting Windows games to run on Linux can sometimes be painful, even though it would be fairly trivial for their developers to support it, getting Ubuntu software to run on NixOS can sometimes be a pain.

It’s up to us to push the industry in a direction in which our use case gets first-class support more often :)


You could give conda-shell a try. I’m able to install spacy-wordnet through conda. Conda-shell has often been the last resort for me, I’ve not had to go as far as running a VM.

conda? Or, better, micromamba?

> Why is it so hard to use a Python package?

If the foundation you are building on is, to put it nicely, not great, there is only so much you can do on your own to make it great. If the ecosystem around Python were solid and portable, Nix could do so much more.

…try pip install, …try mach-nix, …try python3Packages.virtualenv, …try venvShellHook, …try poetry2nix, …try conda-shell, …try micromamba

That’s a few too many “tries” in entirely different directions, enough to overwhelm just about anyone. This is exactly why Python infrastructure in general is bad (you could replace the list with “pip, distutils, setuptools, setup.py, pyproject, poetry, pipenv, pyenv, conda, …” and it would be just as frustrating), and why the existing Python UX in nixpkgs, plainly speaking, sucks, especially for newcomers. I don’t think a list of “tries” should be the first thing we suggest when explaining Python in nixpkgs.

To your questions, @JonathanReeve:

> Why is this so hard to do on NixOS?

Personally, I’m thinking of two partial answers: one for pre-built packages (e.g. wheels from PyPI), and one for nix-built packages. Disclaimer: I do not mean these as objective “facts”; other people may disagree with this assessment.

The pre-built packages are hard to use when they include or depend on any shared libraries (.so). This is because pre-built Python packages are not cross-platform: in fact, they’re usually built for “manylinux”, which effectively means some old CentOS distribution with an FHS environment (including /usr/lib and a dynamic linker). The implication is that to use a pip-installed package more complicated than a pure-Python one, one has to run the Python process in an environment that mimics that of “manylinux”. This is the conda-shell approach mentioned above: one usually uses nixpkgs’ buildFHSUserEnv to create a (“chroot”) directory with all the needed dependencies in conventional FHS paths (/usr/lib, etc.) and spawns a shell in a mount namespace in which those paths appear to exist.

The nix-built Python packages are much easier to consume. Somewhat ironically, however, I’d say they’re not nearly as convenient as nix-built binary libraries when it comes to handling transitive dependencies. When one consumes “native” libraries in Nix (say in buildInputs of mkShell or mkDerivation) one says “I want to make this list of libraries visible” and that just works: e.g. Nix will set up environment variables so that cmake can automatically discover the listed dependencies. An app that links dynamically knows where to find its dynamic dependencies through the library search path recorded in the binary itself, and those dependencies know where to find their own transitive dependencies in the same way. The Python story is quite different: rather than saying “I want to expose these libraries” one has to say “I want a Python interpreter that can import these libraries” (python3.withPackages).
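To make the python3.withPackages point concrete, here is a minimal sketch. The package selection is borrowed from the first post and is otherwise arbitrary; the attribute names are assumed to exist in your nixpkgs revision.

```nix
# shell.nix: a shell whose Python interpreter can import the listed packages.
{ pkgs ? import <nixpkgs> { } }:

let
  # withPackages builds an interpreter wrapper; transitive Python
  # dependencies of the listed packages are pulled in automatically.
  pythonEnv = pkgs.python3.withPackages (ps: with ps; [
    spacy
    nltk
    pandas
    numpy
  ]);
in
pkgs.mkShell {
  buildInputs = [ pythonEnv ];
}
```

Listing pkgs.python3Packages.spacy directly in buildInputs next to a bare pkgs.python3 is the pattern this replaces: the interpreter and its importable packages are declared together instead.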

Of course mirroring all of pypi in nixpkgs would be infeasible and unsustainable. Solutions like mach-nix automate generating nix expressions for consuming pre-built wheels (and more, like conda packages). Other solutions, like poetry2nix, automate generating nix expressions with full build recipes, e.g. using poetry’s lock-file.

What do Python developers do on NixOS? Does everyone just do development in an Ubuntu VM? I’m at my wits’ end here

Most of the time I just use nix-built Python packages. That includes heavy stuff that builds on top of pytorch/tensorflow and CUDA. When I repeatedly need a package that nixpkgs doesn’t provide, I usually just write an expression for it: either right in the project tree, or in a NUR repo.

Of course, quite often I need to just quickly try something out and cannot waste time on packaging. For these cases I have a pre-built FHS environment (a la conda-shell) that I can enter and use pip and conda. I’m also running jupyterhub, which I spawn inside that FHS environment too. I install jupyter user kernels both from conda and from per-project nix-shells. I prefer to use nix-shell (in combination with nix-direnv and jupyter user kernels) whenever possible, because conda occasionally breaks (e.g. last time I checked it couldn’t detect the GPU, although that used to work before).

I haven’t used mach-nix or poetry2nix much (I’m not even sure if I described what they do correctly), but it’s not surprising that they have rough edges: they’re solving a very ill-posed problem.


Sounds reasonable :)


In short: if you know which packages your project needs, check whether they exist in nixpkgs (or NUR) and whether you can build the ones that are missing.

If that is too much “effort”, you don’t really lose anything by starting with mach-nix.


Do you have any example code for your setup? (I’m not totally sure if I understand it correctly)

I’ve been in the same situation, and it’s certainly frustrating. The most frustrating part, however, is that every 2nix solution lacks some feature.

I think mach-nix and poetry2nix are closest to perfection. I think a combination of both would be the best. mach-nix doesn’t support editable installs, and that’s the worst problem for using it for development.

Anyways, to the topic. How did I solve this issue? Well, instead of wanting to use nix for development and packaging, I switched over to using raw poetry for development and nix for packaging.

Then I just used devshell to download the python and poetry versions I need and set up the required environment variables.

I’m following that approach in Mr. Chef. In another project that is not currently open source I also needed binary extensions, and this is what I did in devshell to let poetry work as expected. I think there should be some built-in support in devshell for Python development, it shouldn’t be too hard.
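As a rough illustration of the idea (not the actual devshell configuration referred to above), a plain mkShell can play the same role: it provides Python and Poetry and sets the environment variables that binary wheels tend to need. The library list is a guess; which shared libraries are required depends on the project.

```nix
# shell.nix: provide Python and Poetry, then let Poetry manage the virtualenv.
{ pkgs ? import <nixpkgs> { } }:

pkgs.mkShell {
  buildInputs = [
    pkgs.python3
    pkgs.poetry
  ];

  # Lets pre-built binary extensions installed by Poetry/pip find common
  # shared libraries. Assumption: libstdc++ and zlib are what is missing;
  # extend the list when an import fails with "cannot open shared object file".
  LD_LIBRARY_PATH = pkgs.lib.makeLibraryPath [
    pkgs.stdenv.cc.cc.lib
    pkgs.zlib
  ];
}
```

Inside that shell, poetry install and poetry run work the way your IDE and teammates expect.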

Your IDE will usually have nice support for Poetry and a virtualenv. Your pythonista workmates will also be familiar with that. Just provide common Python and Poetry versions and keep on developing with that. It should be enough.

Finally, package the final product with poetry2nix. If you get missing dependencies, usually adding them as dev-dependencies with poetry fixes the problem.
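A minimal sketch of that packaging step, assuming a nixpkgs revision that still exposes pkgs.poetry2nix (as the default.nix in the first post does) and a pyproject.toml plus poetry.lock in the same directory:

```nix
# default.nix: build the Poetry project as a Nix package.
{ pkgs ? import <nixpkgs> { } }:

pkgs.poetry2nix.mkPoetryApplication {
  # Directory containing pyproject.toml and poetry.lock.
  projectDir = ./.;
}
```

nix-build should then produce the application from the lock file; packages that need extra build inputs may still require poetry2nix overrides.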

You can also provide an alternate shell to develop with poetry2nix. But anyways you’re gonna need poetry to interact with the environment, and poetry is gonna create a virtualenv, so… just assume that and be happy with it! :)


> What do Python developers do on NixOS?

Personally I’ve been using micromamba for quite some time, works well enough.

If you want to then package your app using Nix this is not an option of course, but if you ship to e.g. a docker image that’s not a problem.

e.g.

$ nix-shell -E 'with import <nixpkgs> {}; (pkgs.buildFHSUserEnv { name = "fhs"; targetPkgs = p: [p.micromamba]; }).env'
$ micromamba create -n spacy-wordnet python==3.6.15 -c conda-forge -y
$ eval "$(micromamba shell hook -s bash)"
$ micromamba activate spacy-wordnet
$ pip install spacy spacy-wordnet
$ python
>>> import spacy
>>> from spacy_wordnet.wordnet_annotator import WordnetAnnotator
...

I used to use conda (via conda-shell) but I gave up on it long ago… conda is too slow and unreliable.
Conda was recently updated in nixpkgs (it used to be an ancient version) so maybe it’s usable now but IME even the latest versions of conda are slower and more complicated than micromamba.


> And so then I notice that I’m many hours into this packaging problem, and all I want to do is just try out one stupid python package. Why is this so hard to do on NixOS? Am I just supposed to use Docker for everything?

> What do Python developers do on NixOS? Does everyone just do development in an Ubuntu VM? I’m at my wits’ end here.

More generally, I’d say this bad UX is typical for NixOS when something goes wrong.

With NixOS, when things work, it’s pretty neat. (With NixOS, it’s useful to be able to have the system config all in a file I can track in VCS, to have confidence this is the state the system will end up in, to be able to use the NixOS modules for all sorts of fancy services. I love being able to write Nix for some old projects and not have to later worry about installing things to work on the project).

When things go badly… NixOS’s UX has much more friction compared to more popular Linux distributions.

For one: with other Linux distributions, the system is malleable; they don’t require all configuration to start from a single file, etc. For another: with other Linux distributions, I don’t need a comprehensive understanding of what’s going on to fix a problem. It’s easy to find a StackOverflow answer that fixes whatever the problem is.

When things go badly with Nix, you may need to understand any of the following: Nix itself, the Nix code you’re using, and what the compiler or other program internals are trying to do. Because of this risk of hitting something difficult in the long tail of things a programmer will want to do, NixOS currently … is hard to recommend as practical for “I just need to get the job done”, even if the benefits of having Nix descriptions of a package are fantastic.

TBH, when I encounter something bad like this, I note to myself “I had trouble doing this”, and come back to doing it more successfully later (while using a VM or Docker or whatever in the meantime).


Conda still has more features than micromamba, and of course you can just install mamba in your conda environment if you want that functionality.

I tried using poetry2nix for development, but something always tended to be broken, especially with the latest packages. Using conda, which is the standard in data science, development is seamless and without issues for me. I like nix for a lot of other things (like setting up my tools etc.), but for Python I think it is, at least currently, not worth the effort.

> Anyways, to the topic. How did I solve this issue? Well, instead of wanting to use nix for development and packaging, I switched over to using raw poetry for development and nix for packaging.

This is exactly the workflow I had in mind when I created poetry2nix.
Having been in quite a few Python teams before, a few of which contained a mix of Python and Nix, I realised that Python developers largely saw Nix as a hurdle.
For the most part I wanted Nix not to be in the critical path for Python devs and just get out of their way.

> I think there should be some built-in support in devshell for Python development, it shouldn’t be too hard.

This is something I’ve wanted too for quite a while, and something that I think is much easier to accomplish now that the poetry2nix overrides are starting to become more structured (build flag fixups are separate from build inputs for example).
I’m not sure about the UX around this though, suggestions welcome!


It would be nice if the poetry2nix README stated this. I had to jump through several hoops before realizing this was the way to go.

Initially I thought I could just replace the full python dev toolbelt with poetry2nix.

The fix wasn’t so hard once found. But finding it was crazy hard.

Maybe we can take a look at how other languages are supported in devshell and provide the necessary equivalents for Python (maybe with special support for poetry, given poetry2nix is the best supported bridge these days IMHO).


Has anyone considered trying to directly tackle some of the problems upstream in python & python tooling instead of trying to work around them in nix?

(I’m not sure what it would take but, for example, the ability to use multiple versions of python packages is something that comes to mind which would sure be nice to have and sounds like a huge deficiency in python itself.)


That’s simply not going to happen as the blocker is fundamental to how the module system works.

I suppose multiple versions is kind of an extreme example that would require someone to go through the PEP process.

I just like to challenge the “inside-the-box” thinking that leads to working around problems - maybe someone out there is motivated enough to go to the source and investigate.

Well, the entire point of NixOS is that it is reproducible, and for it to be reproducible you cannot just change anything in any file.
Also, not having a good understanding of what is going on costs you some time in the best case, and in the worst case could have devastating consequences like major data loss.

Let me sum up my conda experience and why I will never touch it again:
I wanted to install a deprecated package which had been replaced in the current Python version. Instead of telling me that after a few seconds, conda churned for over 10 minutes, only to present me with a solution that downgraded my Python version and changed almost every package version I had installed.

You cannot easily change the Python community and its mindset, which is IMO the bigger problem. If the community had been built with more of a distro mindset from the ground up, we could be in a totally different place.
