Data science on NixOS: nix, poetry, pip, mach-nix, pynixify - all fail?

Hello,

I am trying to set up an environment for data science.

But not all Python libraries exist in NixOS/nixpkgs.
→ So is a NixOS-style pure env impossible?

If the packages are not available in nixpkgs, I wouldn’t expect that poetry2nix could solve that issue, can it?

while evaluating anonymous function at /nix/store/dzk0krx7hylcm14wcbhpmqw7pi0z6ll3-nixpkgs-20.03.2652.076c67fdea6/nixpkgs/pkgs/development/tools/poetry2nix/poetry2nix/mk-poetry-dep.nix:132:25, called from undefined position:

error: --- EvalError ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- nix-shell
in file: /nix/store/dzk0krx7hylcm14wcbhpmqw7pi0z6ll3-nixpkgs-20.03.2652.076c67fdea6/nixpkgs/pkgs/development/tools/poetry2nix/poetry2nix/mk-poetry-dep.nix (132:28)

attribute 'typing_extensions' missing

So I tried mach-nix and it fails as well:

resolvelib.resolvers.ResolutionImpossible: [RequirementInformation(requirement=Requirement.parse('dask==2.21.0'), parent=None), RequirementInformation(requirement=Requirement.parse('dask[complete]>=0.18.0'), parent=Candidate(name='datashader', ver=<Version('0.10.0')>, extras=()))]
builder for '/nix/store/gjpfzy43v401p5v3y840wwjlqijxh615-mach_nix_file.drv' failed with exit code 1
error: --- Error --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- nix-build
build of '/nix/store/gjpfzy43v401p5v3y840wwjlqijxh615-mach_nix_file.drv' failed

Use the latest poetry2nix version from GitHub (nix-community/poetry2nix: Convert poetry projects to nix automagically). If it still doesn’t work, show your code so we can help.

[tool.poetry]
name = "nixfriday_poetry"
version = "0.1.0"
description = "test_nix_poetry"
authors = ["TL"]
license = "MIT"

[tool.poetry.dependencies]
python = "^3.7"
flask = "^1.1.2"
pandas = "^1.1.0"
cython = "^0.29.21"

[tool.poetry.dev-dependencies]
pytest = "^6.0.1"

[tool.poetry.scripts]
#nixfriday = 'NixFriday_poetry:main'

[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"

{ pkgs ? import <nixpkgs> { } }:

with pkgs;
let
  src = fetchFromGitHub {
    owner = "nix-community";
    repo = "poetry2nix";
    rev = "270a0b26b773e566ad59927c51d40a5e9b8ff08d";
    sha256 = "0yw8vdwgqw2y3mpyya9gy1l93115yxnv5yamr31nfvzhwkgj9qz5";
  };
in
with import "${src.out}/overlay.nix" pkgs pkgs;
let
  pythonEnv = poetry2nix.mkPoetryEnv {
    python = python3;
    poetrylock = ./poetry.lock;
  };
in
mkShell {
  name = "example";
  nativeBuildInputs = [
    pythonEnv
    poetry
  ];
}

Using pythonImportsCheckPhase
ERROR: Could not find a version that satisfies the requirement atomicwrites>=1.0 (from pytest==4.6.11) (from versions: none)
ERROR: No matching distribution found for atomicwrites>=1.0 (from pytest==4.6.11)
unpacking sources
unpacking source archive /nix/store/q270fmlhskh3fjhgzc1j1kxlyvq10bdv-Flask-1.1.2.tar.gz
source root is Flask-1.1.2
builder for '/nix/store/w5bzqr5zyiwmmdgcgcvppsbfnnic9zqk-python3.7-pytest-4.6.11.drv' failed with exit code 1
cannot build derivation '/nix/store/yhdpvb9ws28fidly6cl3v61wskkc19bf-python3.7-intreehooks-1.0.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/m78b4643a1g94g53qi1d3rsmfd10c7av-poetry-1.0.10.drv': 1 dependencies couldn't be built
error: --- Error --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- nix-shell
build of '/nix/store/24d47a7vn4wv3bzfcq1falvsxya9mkn7-python3-3.7.6-env.drv', '/nix/store/m78b4643a1g94g53qi1d3rsmfd10c7av-poetry-1.0.10.drv' failed



This isn’t the latest version; the commit f394798d72ceeb53485a1a0b7ac1bfb31983dc79 you’re using is from January.

Sorry, too much copying and changing …
Now it should be right.

  • And the error changed to a different one.

It works with a recent nixpkgs, e.g. try:

{ pkgs ? import (fetchTarball "https://github.com/NixOS/nixpkgs/tarball/nixos-unstable") { } }:
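Put together with the shell file posted earlier, the suggestion could look roughly like the sketch below. It only swaps the `pkgs` default and avoids the nested `with` blocks; the poetry2nix revision and hash are the ones quoted above, and the name `poetryPkgs` is just an illustrative binding. Note that the `nixos-unstable` URL tracks a moving branch, so what it evaluates to changes over time.

```nix
# Sketch: the earlier shell file, evaluated against a recent nixpkgs.
{ pkgs ? import (fetchTarball "https://github.com/NixOS/nixpkgs/tarball/nixos-unstable") { } }:

let
  # poetry2nix pinned to the commit quoted earlier in the thread
  src = pkgs.fetchFromGitHub {
    owner = "nix-community";
    repo = "poetry2nix";
    rev = "270a0b26b773e566ad59927c51d40a5e9b8ff08d";
    sha256 = "0yw8vdwgqw2y3mpyya9gy1l93115yxnv5yamr31nfvzhwkgj9qz5";
  };

  # Apply the poetry2nix overlay by hand, as in the original file.
  # "poetryPkgs" is a hypothetical name chosen for this sketch.
  poetryPkgs = import "${src}/overlay.nix" pkgs pkgs;

  # Same arguments as in the original file
  pythonEnv = poetryPkgs.poetry2nix.mkPoetryEnv {
    python = pkgs.python3;
    poetrylock = ./poetry.lock;
  };
in
pkgs.mkShell {
  name = "example";
  nativeBuildInputs = [
    pythonEnv
    pkgs.poetry
  ];
}
```

Running `nix-shell` in the directory containing this file and `poetry.lock` should then drop you into the environment.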

You are right, it works.
(The thing is that it would take a week to build a complex data science environment if everything had to be compiled locally first.)

  • Any suggestions?

I just want to caution that a lot of machine learning frameworks are difficult to get working correctly even on a normal FHS installation. Doing so on nixos is another step of potential pain. (E.g. tensorflow is broken on nixpkgs constantly)

That is the reason why I’m trying NixOS… (besides broken tensorflow …)

  • conda doesn’t resolve the env any more
  • pip fails due to version constraints

Welcome to Python: when stuff works, it barely works, and when stuff is broken, it’s probably broken in many places.

The python packages on nixpkgs have had a lot of handwork go into them to make them “more coherent”. Unfortunately, the python ecosystem (with very few exceptions) is in a state of constant turbulence and breakages.

I maintain a consistent Python environment for data scientists so they don’t have to. I use the buildEnv approach to create a derivation with all the packages they request, which I then install in the default Nix profile (I use the single-user Nix setup with a dedicated “admin” user). If a package is not available in nixpkgs, I create a derivation for it using python-package-init. I also use overlays for tweaking parameters of the packages (MKL, GPU, etc.) and patching broken packages. Every 2-3 months I move the pinned nixpkgs to the latest unstable and rebuild the environment. Such an approach keeps the versions consistent across all teams/projects, and the data scientists don’t have to learn Nix.
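A minimal sketch of what such a setup might look like, not the poster's actual configuration. The file name `env.nix`, the package list, and the `doCheck = false` override are assumptions picked for illustration, and the nixpkgs pin mentioned above is left out for brevity.

```nix
# env.nix (hypothetical file name)
let
  # Overlay that tweaks the Python package set before the environment is built
  pythonOverlay = self: super: {
    python3 = super.python3.override {
      packageOverrides = pySelf: pySuper: {
        # Illustrative patch only: skip the test suite of one package
        pandas = pySuper.pandas.overridePythonAttrs (old: {
          doCheck = false;
        });
      };
    };
  };

  # The poster pins nixpkgs; <nixpkgs> is used here to keep the sketch short
  pkgs = import <nixpkgs> { overlays = [ pythonOverlay ]; };
in
# One derivation holding the interpreter plus all requested libraries
pkgs.python3.buildEnv.override {
  extraLibs = with pkgs.python3.pkgs; [
    numpy
    pandas
    matplotlib
  ];
}
```

Under these assumptions, `nix-build env.nix` would produce a `result` symlink containing `bin/python`, and `nix-env -i -f env.nix` would install the environment into the current profile.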

This is probably the sanest thing to do

not familiar with it, what does it do?

There are a few of us in a slack channel sometimes exchanging tips.

https://join.slack.com/t/nix-data/shared_invite/enQtOTYyNjQzMDA0ODgzLTM3ZDE3Mjg3ZWI2N2ExOTNkOGYzMWU2NTEzZmIyNTE0OTg1NDZlZjYzM2JjNDA4ZTg5MTg3ZWE4NzM5NzI2NzQ

Is this the same group as in Workgroup:DataScience (NixOS Wiki)? Should I add the Slack link there?

I built mach-nix assuming that python packages don’t have circular dependencies. Your examples have proven me wrong. This is now handled in version 2.2.1 together with another python 3.8 related bug.

Now your build should succeed.
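For readers who have not used mach-nix, a build of this kind might be written roughly as follows. This is a sketch under assumptions: the original requirements file is not shown in the thread, so the two requirements are taken from the resolver error above; the tag name `2.2.1` is assumed to match the version mentioned; and the way mach-nix is imported may differ between releases, so check the README of the version you pin.

```nix
let
  # Assumption: a git tag named after the version mentioned above exists
  mach-nix = import (builtins.fetchGit {
    url = "https://github.com/DavHau/mach-nix/";
    ref = "refs/tags/2.2.1";
  });
in
# mkPython resolves the requirements and builds a Python environment
mach-nix.mkPython {
  # Requirements reconstructed from the error log, not the original file
  requirements = ''
    dask==2.21.0
    datashader
  '';
}
```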

It’s the following tool written by @costrouc that generates a Nix expression from a PyPI package: GitHub - nix-community/nixpkgs-pytools: Tools for removing the tedious nature of creating nixpkgs derivations

Circular dependencies are currently quite problematic because Nix requires a DAG. To solve this issue, we should longer-term install a Python package in an output, but perform wrapping/patching to other Python packages in a second derivation (the one creating a Python environment).

That would be cool! I was thinking the same.
This would also improve caching effectiveness.

Currently, if one runtime dependency changes, the package needs to be rebuilt. If we added runtime deps only in a final wrapping step, as you proposed, cache invalidation would be less likely to happen.
Mach-nix would benefit a lot from this, as one little change to a single package often leads to a mass rebuild of many other packages.

Reducing these “side effects” of runtime deps might make it possible to create a cache for mach-nix that holds a few different versions of the most common base dependencies and allows a faster build experience.

Interesting, I’m making a similar tool which tries to do this for any package type: GitHub - jonringer/nix-template: Make creating nix expressions easy

I tested pynixify via a requirements file, but (for now) it fails with a build error.