Mach-nix: Create python environments quickly and easily

Released 3.1.0 (27 Nov 2020)

flakes lib, cli improvements, bugfixes

Features

  • Expose the following functions via the flakes lib (see the usage sketch after this list):
    • mkPython / mkPythonShell / mkDockerImage / mkOverlay / mkNixpkgs / mkPythonOverrides
    • buildPythonPackage / buildPythonApplication
    • fetchPypiSdist / fetchPypiWheel
  • Properly manage and lock versions of nixpkgs and mach-nix for environments created via the mach-nix env command.
  • Add an example showing how to use mach-nix with jupyterWith
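
For instance, a flake that consumes mach-nix through the new lib output might look like this (a minimal sketch; the input ref, the system, and the requirements are only placeholders):

    {
      inputs.mach-nix.url = "github:DavHau/mach-nix/3.1.0";

      outputs = { self, mach-nix, ... }: {
        packages.x86_64-linux.myEnv = mach-nix.lib.x86_64-linux.mkPython {
          # placeholder requirements
          requirements = ''
            requests
            numpy
          '';
        };
      };
    }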

Improvements

  • Improve portability of mach-nix env generated environments. Replace the platform-specific compiled nix expression with a call to mach-nix itself, which is platform-agnostic.
  • Mach-nix now produces the same result no matter whether it is used through the flakes or the legacy interface. The legacy interface now loads its dependencies via flake.lock.

Fixes

  • mkDockerImage produced corrupt images.
  • non-python packages passed via packagesExtra were not available during runtime. Now they are added to the PATH.
  • remove <nixpkgs> impurity in the dependency extractor used in buildPythonPackage.

Released 3.1.1 (27 Nov 2020)

fix cli

Fixes

  • Fix missing flake.lock error when using mach-nix cli.

Released 3.2.0 (11 Mar 2021)

bugfixes, ignoreCollisions

Features

  • add argument ignoreCollisions to all mk* functions (see the sketch below)

  • add passthru attribute expr to the result of mkPython, which is a string containing the internally generated nix expression.

  • add flake output sdist to build a pip-compatible sdist distribution of mach-nix
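
A rough sketch of how the new argument and the new passthru attribute can be used together via the legacy interface (the tag and the requirements are placeholders):

    let
      mach-nix = import (builtins.fetchGit {
        url = "https://github.com/DavHau/mach-nix";
        ref = "refs/tags/3.2.0";
      }) {};
      env = mach-nix.mkPython {
        requirements = "pillow";      # placeholder
        ignoreCollisions = true;      # new argument, accepted by all mk* functions
      };
    in
      # `expr` holds the internally generated nix expression as a string
      builtins.trace env.expr env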

Fixes

  • Sometimes wrong package versions were inherited when using the nixpkgs provider, leading to collision errors or unexpected package versions. Now, python dependencies of nixpkgs candidates are automatically replaced recursively.

  • When cross building, mach-nix attempted to generate the nix expression using the target platform’s python interpreter, resulting in failure.

Package Fixes

  • cartopy: add missing build inputs (geos)

  • google-auth: add missing dependency six when provider is nixpkgs


I implemented an alternative way of fetching python packages that doesn’t require an index or a dependency database. It is a fixed output derivation which just uses pip. Reproducibility is ensured by a local proxy that filters pypi.org responses via date, to provide a snapshot-like view on pypi.
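
To illustrate the general idea only (this is not the implementation from the PR linked below, just a rough sketch of a fixed-output derivation that delegates resolution to pip; the package name is a placeholder, and without the date-filtering proxy it would not be fully reproducible):

    { pkgs ? import <nixpkgs> {} }:
    pkgs.stdenv.mkDerivation {
      name = "python-requirements";
      nativeBuildInputs = [ (pkgs.python3.withPackages (ps: [ ps.pip ])) pkgs.cacert ];
      buildCommand = ''
        export HOME=$TMPDIR
        export SSL_CERT_FILE=${pkgs.cacert}/etc/ssl/certs/ca-bundle.crt
        # let pip resolve and download the requirements into the output
        pip download --dest $out requests
      '';
      # network access is only allowed because this is a fixed-output derivation;
      # the hash must be updated after each change in requirements
      outputHashMode = "recursive";
      outputHashAlgo = "sha256";
      outputHash = pkgs.lib.fakeSha256;
    }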

The Disadvantages I see:

  • all packages have to be re-downloaded after each change in requirements
  • outputHash needs to be updated after each change in requirements

The Benefits I see:

  • doesn’t require storing big index files (they can be annoying as flake inputs, etc.)
  • dependency resolution is exactly like in pip.
  • doesn’t require maintaining the resolver/crawlers

In case you’re interested, have a look at the nixpkgs PR: fetchPythonRequirements: init (fixed output pypi fetcher) by DavHau · Pull Request #121425 · NixOS/nixpkgs · GitHub
It includes an example for jupyterlab.

Maybe a tool could be built around this which offers similar comfort to mach-nix, but without a lot of its complexity (packages could somehow be cached locally to solve the re-downloading issue, etc.).

Not having to maintain the resolver and pypi crawlers would be a big game changer I think.
Just using pip directly is a lot easier than trying to imitate its behavior.

Let me know about your thoughts.


This is awesome! Great work! This is inspiring me for other downstream use-cases, specifically, integrating third-party pip dependencies with Bazel.

Recently I have undertaken some major refactoring of the crawler architecture, which is about to be finished.
This happened outside the mach-nix repo, namely in pypi-deps-db and nix-pypi-fetcher.
As you may know, mach-nix depends on both these projects being updated regularly to be able to compute dependency graphs and fetch packages reproducibly.

The motivations behind the changes were:

  • improve maintainability
  • add data for python 3.9 and 3.10
  • simplify the process of introducing new python versions
  • remove any non-public infrastructure parts
  • remove the requirement of trusting me to host the crawlers
  • make it easy for people to fork and maintain their own data

The following changes have been made:

  • remove the requirement of an SQL database. All update cycles now operate directly on the json files contained in the repo.
  • both projects contain a flake app that updates the data on a local checkout.
  • both projects contain a GitHub Actions cron job that updates the data regularly
  • python versions can be added / removed by slightly modifying the flake.nix
  • a new directory ./sdist-errors is added to pypi-deps-db, containing information about why extracting requirements of a specific sdist package failed.

If the projects are forked on GitHub, the data should continue to update itself without further interaction, as the workflow file will be forked with the project.

On any non-GitHub CI system it should be as simple as installing nix with flakes and then executing the included flake app regularly to keep the data updated.

The newest version of pypi-deps-db now supports python 3.9 and 3.10 while 3.5 was removed.
I still kept python 2.7 despite it being EOL. My gut tells me there is still too much software around depending on it. Does anybody still need 2.7?


Released 3.3.0 (22 May 2021)

bugfixes, improvements

Changes

  • The flakes cmdline API has been changed (examples below). New usage:
    nix (build|shell) mach-nix#gen.(python|docker).package1.package2...
    
    (Despite this change being backward incompatible, I did not bump the major version since everything flakes-related should be considered experimental anyway)
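
    For example (package names are placeholders):

    # drop into a shell with requests and numpy available
    nix shell mach-nix#gen.python.requests.numpy

    # build a docker image containing the same packages
    nix build mach-nix#gen.docker.requests.numpy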

Improvements

  • Mach-nix (used via flakes) will now throw an error if the selected nixpkgs version is newer than the dependency DB since this can cause conflicts in the resulting environment.
  • When used via flakes, it was impossible to select the python version because the import function is not used anymore. Now python can alternatively be passed to mkPython.
  • For the flakes cmdline api, collisions are now ignored by default
  • The simplified override interface did not deal well with non-existent values (see the sketch after this list).
    • Now the .add directive automatically assumes an empty list/set/string when the attribute to be extended doesn’t exist.
    • Now the .mod directive will pass null to the given function if the attribute to modify doesn’t exist.
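
A hedged sketch of what this means for the simplified override interface (the ref, the package, and the inputs are only illustrative):

    let
      pkgs = import <nixpkgs> { };
      mach-nix = import (builtins.fetchGit {
        url = "https://github.com/DavHau/mach-nix";
        ref = "refs/tags/3.3.0";
      }) { };
    in
    mach-nix.mkPython {
      requirements = "soundfile";    # placeholder
      # `.add` now works even if buildInputs was not previously defined
      _.soundfile.buildInputs.add = [ pkgs.libsndfile ];
      # `.mod` receives null when the attribute to modify does not exist yet
      _.soundfile.propagatedBuildInputs.mod = old:
        (if old == null then [ ] else old) ++ [ pkgs.libsndfile ];
    }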

Fixes

  • Generating an environment with a package named overrides failed due to a variable name collision in the resulting nix expression.
  • When used via flakes, the pypiData was downloaded twice, because the legacy code path for fetching was still used instead of the flakes input.
  • nix flake show mach-nix failed because it required IFD for foreign platforms.
  • For environments generated via mach-nix env ... the python command referred to the wrong interpreter.
  • When checking wheels for compatibility, the minor version for python was not respected which could lead to invalid environments.
  • Some python modules in nixpkgs propagate unnecessary dependencies which could lead to collisions in the final environment. Now mach-nix recursively removes all python dependencies which are not strictly required.

Package Fixes

  • cryptography: remove rust related hook when version < 3.4

Excellent work, mach-nix was the easiest way to feed in a requirements.txt and get a Nix environment.


I’m currently experimenting with python’s import system.
The goal is to allow packages to have private dependencies which are not propagated into the global module scope. If this works, we could build python environments containing more than one version of the same library. This in turn would make dependency resolution trivial/unnecessary and could solve the patching madness in nixpkgs.

I somehow cannot believe that nobody ever tried this, but I could not find such attempts online. In case anybody knows about such attempts or has any input regarding this, I’d appreciate it.


@costrouc looked into this in the past


@DavHau this is the specific code that does the import rewrites on files https://github.com/nix-community/nixpkgs-pytools/blob/70c7b9db33ea5e31d35d0b67c9171757e4d74bd0/nixpkgs_pytools/import_rewrite.py. There are some shortcomings of this approach, mainly that it can’t touch shared libraries to do the rewrites; I wrote up a few other shortcomings in the things that @FRidh linked. The approach seemed pretty robust though when I tried it out with packages.


Thanks for that. I actually took a look at your approach @costrouc a while ago and it was definitely inspiring. I forgot to mention that earlier. Now I am planning to implement something that doesn’t require any modification of library code.

My current idea is to replace builtins.__import__, which is called on every import. This new import function would then inspect the caller’s location. Depending on where it is called from, it chooses a different set of dependencies (every package would bring its own site-packages). Each imported module would get a new unique name in sys.modules to prevent clashes.

My goal is to get rid of sys.path/PYTHONPATH completely and only use the new style of packaging/importing.
The system would be smart enough to detect if two modules depend on the same version and only instantiate that module version once.

In the last few days, I have already implemented something similar via importlib’s PathFinder and FileFinder. But that was too hacky and fragile. Later I found out it is possible to just override builtins.__import__. This should make it easier.

So now I’m starting over with a blank page and thought I’d reach out to you guys first, to prevent ending up in another If-I-had-only-known situation.

I haven’t seen the discussion on the python forum so far. That is definitely interesting.


I do monkeypatch __builtins__.__import__ in resholve, though it may not be a helpful reference since I’m using it to force a namespace on the Oil shell’s python2.7 codebase (and not for Nix/nixpkgs-specific reasons): https://github.com/abathur/resholve/blob/591ae30b839d0bae6f02ec7abd852004c6acbbb8/resholve#L23-L130


Conda support is now merged into master. By default it is disabled, but it can be enabled by adding conda to the providers, for example like this:

providers._default = "conda,wheel,sdist,nixpkgs"

or by passing the content of a requirements.yml file to requirements, which will automatically enable the provider.
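
For example, enabling the conda provider for a whole environment (a minimal sketch, assuming mach-nix is already imported; the packages are placeholders):

    mach-nix.mkPython {
      providers._default = "conda,wheel,sdist,nixpkgs";
      requirements = ''
        numpy
        pandas
      '';
    }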

Supporting conda required me to implement a custom requirements parser, as the new format allows both pip and conda formats and even mixes of these.
Therefore, even if you don’t use conda, there is a chance that you might discover a bug. There is extensive unit testing on these changes, but edge cases could still come up. I might leave this on master for a bit longer and see if anything gets reported by you guys.


Awesome! What should one pass as condaChannelsExtra to enable conda-forge?

conda-forge is included by default. Just add it to the providers (like "conda-forge,conda,nixpkgs"). But good point, docs are missing for some of the new stuff ;). In the meantime, scroll up this topic to when the conda beta was released. There are some examples included.


Is there any way to use mach-nix and have the resulting env use a modified or custom python? Something like python = (enableDebugging pkgs.python39); in the call to mkPythonShell? I want to set up an environment where I can use cygdb in cython, and that requires a debugging python.


You can make an overlay for nixpkgs which replaces python39 with your debugging python. Then import nixpkgs with that overlay and pass it to mach-nix during import via the pkgs argument. If you need further assistance, feel free to open an issue on GitHub.
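
A rough sketch of that approach (the pkgs argument is as described above; the python argument, the mach-nix ref, and the requirements are assumptions/placeholders):

    let
      pkgs = import <nixpkgs> {
        overlays = [
          (self: super: { python39 = super.enableDebugging super.python39; })
        ];
      };
      mach-nix = import (builtins.fetchGit {
        url = "https://github.com/DavHau/mach-nix";
        ref = "refs/tags/3.3.0";
      }) {
        inherit pkgs;
        python = "python39";
      };
    in
    mach-nix.mkPythonShell {
      requirements = "cython";
    }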


Released 3.4.0 (04 Feb 2022)

aarch64-darwin, nixpkgs provider improvements, bugfixes

Features

  • support conda packages
  • support wheels for apple m1 (aarch64-darwin)

Changes

  • remove support for installing mach-nix via pip
  • updated inputs nixpkgs, pypi-deps-db, conda-channels

Improvements

  • support for PEP600 wheels of format: ‘manylinux_${GLIBCMAJOR}_${GLIBCMINOR}’
  • respect ‘python_requires’ for sdist packages
  • PEP440 compatible pre-release version handling
  • improve handling of python packages from nixpkgs

Fixes

  • fix problem where required dependencies were removed from nixpkgs python modules.
  • prevent package collisions with dependencies from packagesExtra
  • fix version comparison of versions with arbitrary length
  • various fixes for macOS
  • various fixes for requirements parsing
  • various other fixes

Package Fixes

  • libwebp-base: remove colliding binaries in conda package
  • pyqt5: fix missing wrapQtAppsHook

I am looking for a way to build ifcopenshell with nixpkgs.
I get a collision error on python3-x.x.x-env/bin/idle.

I’m having a hard time using the latest releases of ifcopenshell.
Both are a bit complicated, but the derivations I use are:

(python39.withPackages (py-packages: with py-packages; other-modules ++ [ ifcopenshell ]))

and

This is just a wish: like the other-modules list above, I wish mach-nix modules (mach-modules) could be easily removed and pasted in, like:
(python39.withPackages (py-packages: with py-packages; other-modules ++ [ ifcopenshell ] ++ mach-modules ))

In particular, wouldn’t it be better to subtract them as nixpkgs instead of unifying the requirements?