Tracking down differences between NixOS, Nix on Linux and Nix on macOS

TL;DR: nix-shell behaves differently on NixOS, Nix on Linux, and Nix on macOS. How can I track down the cause?

I am trying to use Nix to create reproducible development environments for a project written in Python.

This is a scientific research project, which leads it to having two perhaps slightly unusual requirements:

  1. There is no real distinction between users and developers: users frequently want to tweak the source code between executions, so the standard way to install the software is in development mode with python setup.py develop. If it were 100% pure Python, no installation step would be needed at all beyond getting the dependencies into the environment, except that
  • the project does contain a small number of modules written in Cython
  • the software relies on a number of environment variables being set.
  2. We need the ability to go back to an old version (arbitrary git commit) of the software and run the code to generate results using exactly those versions of the dependencies which were being used at the time the commit was made.

Hopefully the above points aren’t really relevant to my immediate problem, but I felt I should include them to give some context which might explain some things that look unusual in what follows.

As a first attempt, I have written a shell.nix in the project’s root that looks like this:

with import (builtins.fetchTarball {
  name = "nixos-20.03";
  url = "https://github.com/NixOS/nixpkgs/archive/20.03.tar.gz";
  sha256 = "0182ys095dfx02vl2a20j1hz92dx3mfgz2a6fhn31bqlp1wa8hlq";
}) {};

pkgs.mkShell {
  src = ./.;
  buildInputs = with python.pkgs; [
    cython
    numpy
    pandas
    # etc., etc.
  ];

  shellHook = ''
    export SOME_VARIABLES_WE_NEED_AT_RUNTIME=whatever
    python setup.py build_ext --inplace
  '';
}

(Yes, I am aware of the development mode in buildPythonPackage, which could/should replace the python setup.py build_ext --inplace in the shellHook, but that has orthogonal issues which I don’t want to distract from the problem at hand.)

The idea is that any developer, be it on NixOS or some other Linux or macOS with Nix installed, should be able to run and develop the software using some variations on the theme:

cd /path/to/project/checkout
nix-shell
pytest

I tried to test this on Travis, using a configuration which is essentially this:

language: minimal

os:
  - linux
  - osx

install:
  - sudo curl -L https://nixos.org/nix/install | sh
  - . $HOME/.nix-profile/etc/profile.d/nix.sh

script:
  - nix-shell --run 'HYPOTHESIS_PROFILE=travis-ci pytest -v'

(Yes, I’m aware that Travis provides language: nix: it is broken at the moment.)

I also ran the tests on my local machine, which runs NixOS, and observed the following results:

  • Travis linux: all the project’s tests pass
  • Travis macOS: lots of tests fail
  • Local NixOS: exactly the same tests fail as on Travis macOS

Given that the nix shell in which these tests are executed uses a very specific version of nixpkgs, how should I proceed in trying to understand and eliminate these differences in behaviour?

The success on Travis Linux is likely related to impurities from the host operating system. My first suggestion would be to try with nix-shell --pure, so that any environment variables not defined by shell.nix are unset. That should hopefully get the tests to fail in the same way as on NixOS and macOS. Beyond that, there are a number of avenues to follow:

  • Making sure that the resulting derivation is the same on both Linux systems (it won’t be the same on macOS, which is a different platform): run nix-instantiate shell.nix in both environments and compare the hashes. If they differ, the nix expressions themselves have impurities such as references to environment variables, reading files, or paths imported into the nix store; try grepping for getEnv, readFile, and readDir. If paths are imported into the nix store, you can track this down by running nix-instantiate with -vv and looking for copied source in the output.
  • Making sure that nothing from the host OS gets run — wrapping your nix-shell invocation in strace -fe file -o >(grep usr) might be a good start here.
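
The first check above can be sketched as a small script. This is only a sketch under assumptions: drv_hash is a hypothetical helper name, the script is run from the project root where shell.nix lives, the grep only covers .nix files in the checkout (not nixpkgs itself), and the printed hash still has to be compared by hand with the one printed on the other machine.

```shell
#!/usr/bin/env bash
# Sketch: print the derivation hash of shell.nix and look for common impurities.
# drv_hash is a hypothetical helper, not part of Nix.

# Extract the hash part of a store path such as /nix/store/<hash>-<name>.drv
drv_hash() {
  local base="${1##*/}"          # drop the /nix/store/ prefix
  printf '%s\n' "${base%%-*}"    # keep everything before the first '-'
}

# Only call Nix when it is installed and a shell.nix is present.
if command -v nix-instantiate >/dev/null 2>&1 && [ -f shell.nix ]; then
  drv_path="$(nix-instantiate shell.nix)"
  echo "derivation: ${drv_path}"
  echo "hash:       $(drv_hash "${drv_path}")"

  # Builtins that commonly make an expression depend on the host.
  grep -rnE 'getEnv|readFile|readDir' --include='*.nix' . \
    || echo "no getEnv/readFile/readDir found in local .nix files"
fi
```

Running this on each machine and comparing the hash lines shows whether the two evaluations agree. Running nix-shell --pure --run 'pytest -v' afterwards repeats the tests with the host environment cleared.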

The failures were caused by Git LFS. Git LFS was not installed on the NixOS machine or Travis OSX.

  • On my personal NixOS machine, the solution was to enable Git LFS in home-manager:

    programs.git.lfs.enable = true;
    
  • On Travis OSX I added the following to .travis.yml:

    before_install:
      - if [ "$TRAVIS_OS_NAME" = "osx" ]; then brew install git-lfs; fi
      - if [ "$TRAVIS_OS_NAME" = "osx" ]; then git lfs install;      fi
    
    before_script:
      - if [ "$TRAVIS_OS_NAME" = "osx" ]; then git lfs pull; fi
    
  • On Travis Linux Git LFS is installed by default, which is why everything worked there in the first place.

At this stage, all three systems appear to be behaving identically, so Nix has done its job: the problem lay elsewhere. Well done Nix! Sorry about the noise, but hopefully something here will be of some use to someone.

Thank you, @lheckemann, for your suggestions.

Doesn’t that mean that Git LFS should somehow be part of that nix build environment instead of an external dependency? Maybe in a two-step process, if it doesn’t work directly. Fetching the sources reliably is just as important for reproducibility as the build.

Very good point.

I was too focused on the “I’ve been hacking on this in my git clone and want to compare the results to <some commit>, so I need the correct corresponding dependencies” aspect. In that context git-lfs must have been working already.

But you’re absolutely right: eventually I will also want to use this to deploy onto production machines, and then having Nix ensure that git-lfs is working properly will be important.
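
One way this might look, as a sketch rather than a tested setup: check whether a data file is still an un-fetched Git LFS pointer, and fetch the real content with a git-lfs supplied by Nix instead of the host. is_lfs_pointer is a hypothetical helper name, and nix-shell -p takes its packages from the <nixpkgs> channel rather than the revision pinned in shell.nix, so listing git-lfs in the buildInputs of shell.nix may be the better long-term option.

```shell
#!/usr/bin/env bash
# Sketch: detect un-fetched Git LFS pointer files and fetch them with a
# git-lfs provided by Nix. is_lfs_pointer is a hypothetical helper name.

# A pointer file is a small text file whose first line names the LFS spec.
is_lfs_pointer() {
  [ -f "$1" ] &&
    head -c 100 "$1" | head -n 1 |
      grep -q '^version https://git-lfs\.github\.com/spec/v1'
}

# Only fetch when Nix is installed and we are inside a git checkout.
if command -v nix-shell >/dev/null 2>&1 &&
   git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  # Assumption: -p resolves git and git-lfs from the <nixpkgs> channel,
  # not from the nixpkgs revision pinned in shell.nix.
  nix-shell -p git git-lfs --run 'git lfs install --local && git lfs pull'
fi
```

Calling is_lfs_pointer on one of the project's LFS-tracked files before running pytest would have flagged the missing data on the NixOS and Travis macOS machines directly.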

@jacg This is working again.