Data Science on NixOS: nix, poetry, pip, mach-nix, pynixify - all fail?

Regarding mach-nix:

(On NixOS 20.03) Why do I get Python 2.7 from home-manager
when I execute:

nix-shell shell_mach-nix__test.nix --show-trace
[nix-shell:] which python 
/home/user/.nix-profile/bin/python
[nix-shell:] ls -lah /home/user/.nix-profile/bin/python
lrwxrwxrwx 1 root root 72 Jan  1  1970 /home/user/.nix-profile/bin/python -> /nix/store/5rzsb9q7v4qzbgwz3pp6kfzkf48kyzq8-home-manager-path/bin/python
[nix-shell:] python --version
Python 2.7.18


python points to python2.7 for historical reasons. If you want to reference python3, then you must specify python3

good evening jonringer,

the link / config is

python = pkgs.python38;

then I’m not sure, :frowning:

The problem here was that the python derivation was loaded directly into nix shell instead of being wrapped in a mkShell.
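To illustrate the point above, here is a minimal sketch of the wrapped form. It assumes plain nixpkgs rather than mach-nix, and the file name shell.nix and the package list (numpy, pandas) are illustrative only, not taken from the original expression:

```nix
# shell.nix (hypothetical file name)
{ pkgs ? import <nixpkgs> { } }:

let
  # Assumption: Python 3.8 with two example packages from nixpkgs.
  pythonEnv = pkgs.python38.withPackages (ps: with ps; [ numpy pandas ]);
in
# mkShell lists the Python environment as an input of the shell,
# so nix-shell builds it and puts its bin/ directory on PATH.
pkgs.mkShell {
  buildInputs = [ pythonEnv ];
}
```

With a file like this, `which python` inside nix-shell should resolve to a /nix/store path instead of ~/.nix-profile/bin/python. Passing the Python derivation itself to nix-shell only sets up the environment needed to build that derivation, which is why the home-manager python was still the one found on PATH.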

I have to say I think it is actually really weird behaviour for nix-shell not to build the derivation by default. I know that nix-shell was originally invented to debug builds, and therefore not building the main derivation made sense.

But by now I assume that 95% of all uses of nix-shell (if not even more) are to load normal environments.

So why don’t we change the default behaviour of nix-shell to actually load derivations, while still allowing the build debug mode to be used by passing --debug-build?

I’m very sure that this would have saved me a lot of time and confusion when I started using nix.

Isn’t it a bit strange that you always need to rewrite/extend your expression before you can use it in a shell?


I have found jupyterWith to be reliable for data science work on nixos.
It works with venv for pip packages that aren’t packaged on nix too.

https://github.com/tweag/jupyterWith

The other approach that is really nice is from the nixpkgs manual, in section:

15.17.3.6. How to consume python modules using pip in a virtual environment like I am used to on other Operating Systems?

Which shows how to combine nix python packages with a venv in which you can use pip to install other packages.
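A rough sketch of that approach, modelled on the venvShellHook example in the nixpkgs manual. The package choices are assumptions for illustration, and the hook may not be available in older nixpkgs releases:

```nix
{ pkgs ? import <nixpkgs> { } }:

let
  pythonPackages = pkgs.python3Packages;
in
pkgs.mkShell {
  # Directory in which the hook creates (or reuses) the virtualenv.
  venvDir = "./.venv";

  buildInputs = [
    pythonPackages.python
    # Creates and activates the venv in venvDir when the shell starts.
    pythonPackages.venvShellHook
    # Packages provided by nixpkgs (illustrative choice).
    pythonPackages.numpy
    pythonPackages.pandas
  ];

  # Runs after the venv is activated. pip can fail to build wheels while
  # SOURCE_DATE_EPOCH is set, because ZIP archives reject timestamps
  # before 1980.
  postShellHook = ''
    unset SOURCE_DATE_EPOCH
  '';
}
```

Inside the resulting shell, pip install writes into ./.venv, so anything missing from nixpkgs can be added there. That part is impure and not reproducible by Nix itself.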


Hello aaron.ash,

if you are using the impure pip venv, isn’t it then easier to use conda on ubuntu (or fedora) instead (so you could work with 99% of the features of JupyterLab)?

I have not found a good example of jupyterWith running properly with JupyterLab extensions and with packages that are not in nixpkgs.
It would be great if you could link one.

This talk shows some examples of how conda isn’t sufficient for a reproducible environment: Using Nix for Repeatable Python Environments | SciPy 2019 | Daniel Wheeler - YouTube


We tried using conda and it was a total disaster. The shared libraries it adds in front of LD_LIBRARY_PATH broke all kinds of tools from the host OS (RHEL6 at the time). Adding channels for packages not available in the default repository would trigger massive downgrades of other packages. Then I found Nix and got rid of conda as fast as I could :wink:


I haven’t really needed to use any of the JupyterLab extensions yet but the jupyterWith documentation does cover that: GitHub - tweag/jupyterWith: declarative and reproducible Jupyter environments - powered by Nix

My experience with conda has been similar to that of @jonringer and @alexv. We tried it at work on a mix of Windows and Fedora machines, and it was incredibly slow and unreliable and effectively drove us away from Jupyter for most things.


With conda in a built env I never had issues with performance
(Windows is more likely to blame there :slight_smile:).

But as the video demonstrated, in conda the order of operations is key, and pinning packages and channels before installation makes it quite stable :slight_smile: Because of that I used it in Docker.
In my case the solver took hours, and later, in conda 4.x, more than a day, or it failed outright (even for already built envs).

For me working with Jupyter without extensions would be like working with VScode without extensions…
I saw the documentation (which is quite old in places) and tried some things out, but for now I don’t see much benefit from jupyterWith for building an advanced Data Science environment
… will try some more
