Share your impure Python ML shells/envs

Python packaging in Nix is a pain because Python devs seem not to consider reproducibility past pip or, if we’re lucky, Poetry.

I’ve had mixed luck in the past with shells meant to just pip install something. I’m into LLMs now, so these examples come to mind:

  • koboldAI

  • vllm

Another issue is that when something is packaged and I encounter an issue, I’m not confident it isn’t an issue with how it was packaged.

Hence my asking for tips, shells, etc. on how others “install Python things normally”.

A nix shell with core python packages (numpy, pytorch, scipy) + a virtual env created with the shell python + pip installing the rest works extremely well in my experience. Sure, it’s not technically reproducible, but it’s pretty close.
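To see how the split works out in practice, you can check where each import actually resolves from. This is only a minimal sketch: it assumes the shell-provided packages live under /nix/store and the pip-installed ones under the active virtual env (sys.prefix), and the helper names classify_origin and package_origin are made up for illustration.

```python
import importlib.util
import sys
from pathlib import Path


def classify_origin(path, venv_prefix):
    """Classify a file path as coming from the venv, the Nix store, or elsewhere."""
    p = Path(path)
    if Path(venv_prefix) in p.parents:
        return "venv"
    # Assumption: Nix-provided packages resolve to paths under /nix/store.
    if Path("/nix/store") in p.parents:
        return "nix"
    return "other"


def package_origin(name, venv_prefix=sys.prefix):
    """Report where a top-level package would be imported from, without importing it."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "missing"
    if spec.origin is None:
        # Namespace packages have no single origin file.
        return "other"
    return classify_origin(spec.origin, venv_prefix)


if __name__ == "__main__":
    for name in ("numpy", "torch", "scipy"):
        print(name, package_origin(name))
```

Run inside the activated venv, this should show at a glance whether a core package is the one from the shell or one that pip pulled in behind your back.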

Looking at those packages in particular, KoboldAI seems like it might be hard, but surprisingly, CUDA on NixOS is pretty reliable (shout-out to whoever maintains that), so vllm shouldn’t be so bad to set up.

This keeps being tricky. I’ve tried several methods over the past year or so.

  • LD_LIBRARY_PATH is a way to make sure specific library paths are preferred over others. This usually works for libraries like OpenSSL, where one version is often interchangeable with another. It will fail badly for glibc: one application is compiled against version X and another against version Y. With LD_LIBRARY_PATH both are forced to use the same version, which will fail, because glibc versions aren’t easily interchangeable.
    So, this method isn’t great.
  • poetry2nix is much more strict. It takes a Poetry project and converts all Python package sources to Nix derivations. It supports adding or overriding the build information of each package, for instance adding native libraries as buildInputs.
    The problem I ran into was that several Python libraries didn’t have very reproducible builds or didn’t adhere to implicit standards. For instance, building a cargo project within the Python package, but without a lock file. That is impossible to do within Nix without supplying such a lock file.
    Eventually I couldn’t get things working. It sometimes requires knowledge of all the build systems used within the Python ecosystem.
  • autopatchelf is a tool within nixpkgs that automatically runs patchelf on .so files and ELF executables against one or more library paths. Unlike LD_LIBRARY_PATH, this can be done at ‘build’-time instead of runtime. Once I had run poetry, I’d autopatchelf all files in .venv to make sure the native dependencies were satisfied.
    One advantage of this method is that it can report missing dependencies beforehand: when an ELF dependency is missing and patchelf cannot find a library for it, the process can fail at ‘build’-time with an error about a missing .so file. Finding which package provides that file with nix-locate is fairly easy.
    With this method it doesn’t matter whether the package installer downloads prebuilt binaries or not, or whether it does so in a non-reproducible way. The resulting binaries will be patched anyway.
    One disadvantage of this method is that it only works on a fresh .venv. It won’t work after changing native packages, because the Python packages were already patched. The .venv needs to be rebuilt.
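To get a feel for what the last method has to touch, here is a minimal sketch that only lists the ELF files in a .venv. It does not patch anything and is not how the nixpkgs tool works internally; the function name find_elf_files and the default .venv path are just assumptions for illustration.

```python
import os
import sys
from pathlib import Path

# Every ELF file (shared object or executable) starts with these four bytes.
ELF_MAGIC = b"\x7fELF"


def find_elf_files(root):
    """Return a sorted list of regular files under root that start with the ELF magic."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = Path(dirpath) / filename
            # Skip symlinks so each real file is reported only once.
            if path.is_symlink() or not path.is_file():
                continue
            try:
                with open(path, "rb") as fh:
                    if fh.read(4) == ELF_MAGIC:
                        found.append(path)
            except OSError:
                # Unreadable files are ignored in this sketch.
                continue
    return sorted(found)


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else ".venv"
    for path in find_elf_files(root):
        print(path)
```

Each file it prints is a candidate for patching. Running patchelf --print-needed on one of them shows the libraries it expects, which is the list that has to be satisfied from the Nix store.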

For the last method I made an experimental module for devenv.sh to evaluate it: python: add patchelf option by bobvanderlinden · Pull Request #840 · cachix/devenv · GitHub

I don’t think it’s easy yet, but I think it’s a practical approach going forward. A working solution using poetry2nix would still be preferable.
