Mach-nix: Create python environments quick and easy

Released conda-beta (21 Sep 2020)

Conda provider, Conda parser

TL;DR;

This version introduces a new provider for conda packages and the capability of parsing conda’s environment.yml file. Those are independent features, meaning you can access packages from conda without using the conda environment.yml format, or you can build from an environment.yml file while taking packages from pypi or nixpkgs.

The default provider order is updated to: conda,wheel,sdist,nixpkgs
The provider conda is a shorthand combining two conda channels conda/main,conda/r.

Other conda channels are also available. Just use the channel name as a provider, like:
conda-forge,wheel,nixpkgs

If a conda channel is not yet known by mach-nix, add it via condaChannelsExtra

System dependencies of conda packages are satisfied via the anaconda repo. Those system libraries will take up additional closure space. Therefore the conda provider might not be the optimal choice for container scenarios.

Motivation

Obviously it comes in handy being able to transfer conda based environments to nix. But also people, who never used conda before, might benefit from an improved user experience thanks to this update.

I’d like to mention a few important advantages of the conda ecosystem, which make me believe, the conda provider should become the default provider for future stable versions of mach-nix.

Convertability

Compared to pypi, conda packages/environments seem to be much easier to convert to nix. All meta data of packages (dependencies, etc.) are provided by anaconda via a single file per each repo called repodata.json. This file contains all information necessary to convert conda requirements to nix expressions.
Another important factor is, that conda packages declare their non-python dependencies.

Build variants

Of course nothing can compete with nixpkgs in terms of flexibility, but python packages from nixpkgs create a major problem in conjunction with mach-nix. Since mach-nix always changes some dependencies/attributes, every package must be re-built locally which is not much fun for larger packages.
Like nixpkgs, conda provides different build variants for some packages and those are cheap to install.

Non-python dependencies

Large binary packages from pypi are difficult to install. They are likely to break due to unfulfilled dependencies or wrong dependency versions.
The problem is that the manylinux wheel standard cannot be fulfilled by many large packages, since they require non-python dependencies which are not available from pypi.

Fulfilling non-python dependencies for conda packages, on the the other hand, is easy, since those are fully declared and available from anaconda.

Independent from my infrastructure

Updating the databases for pypi releases requires constant crawling of meta data.
If I stopped running these crawlers, mach-nix users would be stuck with an outdated version of pypi metadata until someone decides to rehost the infrastructure (which is open sourced via nixops template).
For conda, mach-nix just requires repodata.json files. Those can be downloaded from anaconda anytime and do not require any extra infrastructure to maintain.

aarch64 support

It seems like some community maintained conda channels, like for example conda-forge, provide a lot more binary releases for aarch64 than pypi does.

Examples

Import mach-nix with conda support

let mach-nix = import (builtins.fetchGit {
  url = "https://github.com/DavHau/mach-nix";
  ref = "refs/heads/conda-beta";
}) {}; in
...

Build conda environment defined via environment.yml

... # import
mach-nix.mkPython {
  requirements = builtins.readFile ./environment.yml;
}

Select build variants and channels

... # import
mach-nix.mkPython {
  requirements = ''
    tensorflow >=2.3.0 mkl*
    blas * mkl*
    requests
  '';
  providers.requests = "conda-forge";
}

Include extra conda channels

Channels added via condaChannelsExtra are automatically appended to the default providers. This example uses impure fetching for simplicity. It’s better to manually download the repodata.json files and reference them locally.

let mach-nix = import (builtins.fetchGit {
  url = "https://github.com/DavHau/mach-nix";
  ref = "refs/heads/conda-beta";
}) {
  condaChannelsExtra.bioconda = [
    (builtins.fetchurl "https://conda.anaconda.org/bioconda/linux-64/repodata.json")
    (builtins.fetchurl "https://conda.anaconda.org/bioconda/noarch/repodata.json")
  ];
}; in
mach-nix.mkPython {
  requirements = ''
    kraken2
  '';
}
8 Likes