Debugging Python Packages test phase

vanschelven · July 25, 2019, 5:56pm

(Warning: question follows after long introduction)
When creating new Python package definitions or debugging or upgrading existing ones, it often happens that the tests fail. It is then useful to debug these somehow.

To do this, I have found it useful to edit the derivation (temporarily – until the problems are resolved) in the following way:

Turn off the checkPhase (since the tests fail, this is required to install anything at all).
Figure out which command runs the tests.
Figure out the location of the sources in the Nix store (this can be done by reading the build’s output).
Promote any test-related nativeBuildInputs or checkInputs to propagatedBuildInputs (we need them to rerun the tests manually later on).
Create a shell.nix like so:

with import <nixpkgs> {};

(pkgs.python3.withPackages (ps: with ps; [
  the-package-we-care-about
])).env

Run nix-shell.
From inside the shell, debug the tests by running them with an interactive debugger that starts on failure, e.g. (for pytest) pytest --pdb.

Optionally:

If the interactive debugger by itself is not sufficient, it may be useful to unpack the full source from the Nix store to an editable location, navigate to that location, and run the tests from there.

And now for the question:

In the above, a number of steps involve temporarily editing the expression we’re trying to get to run by hand, only to revert those changes later. It would be nice if there were a utility that could be applied to a package definition and would automatically do the following:

Turn off checkPhase.
Make the actually used checkPhase easily printable (e.g. by introducing a command of a known name into the environment which prints it to the screen).
Make the source location easily printable (in a similar way).
Promote all nativeBuildInputs and checkInputs to propagatedBuildInputs.
Optionally: override src to point to a local checkout.

The shell.nix would then look like this:

with import <nixpkgs> {};

(pkgs.python3.withPackages (ps: with ps; [
  debug-python-tests the-package-we-care-about
])).env

Unfortunately, I’m not fluent enough in Nix and its attribute-overriding mechanisms to come up with a definition for debug-python-tests. Any pointers are much appreciated!
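
For illustration, one possible shape for such a helper is sketched below. This is only a sketch under assumptions: debug-python-tests is the hypothetical name from the question and is not part of nixpkgs, the-package-we-care-about is the same placeholder as above, and the package is assumed to be built with buildPythonPackage so that overridePythonAttrs is available. It only covers turning off the tests and propagating the test dependencies; nativeCheckInputs is included because newer nixpkgs versions use that attribute as well.

```nix
# shell.nix (sketch; debug-python-tests is a hypothetical helper, not in nixpkgs)
with import <nixpkgs> {};

let
  # Takes a Python package and returns a variant whose tests are not run
  # and whose test dependencies are propagated, so that they end up in the
  # Python environment and the tests can be rerun by hand.
  debug-python-tests = pkg: pkg.overridePythonAttrs (old: {
    doCheck = false;
    propagatedBuildInputs =
      (old.propagatedBuildInputs or [ ])
      ++ (old.checkInputs or [ ])
      ++ (old.nativeCheckInputs or [ ]);
  });
in
(python3.withPackages (ps: with ps; [
  # Parenthesized so that the list contains the result of the call.
  (debug-python-tests the-package-we-care-about)
])).env
```

Test-related nativeBuildInputs could be appended to the same list in the same way if a package declares its test tools there.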

deliciouslytyped · July 27, 2019, 4:20pm

+1 for anything that improves the debug cycle!

deliciouslytyped · July 29, 2019, 6:10pm

Useless nitpick: I think you meant (debug-python-tests the-package-we-care-about). Function applications inside a list have to be parenthesized; otherwise Nix parses them as two separate list elements, a function and a package, and you will get strange errors.

nix-repl> a = {}: 1

nix-repl> [ a {} ]
[ «lambda @ (string):1:2» { ... } ]

nix-repl> [ (a {}) ]
[ 1 ]

jorsn · April 30, 2020, 5:57pm

Edit: Simplest solution.

From the nix-shell(1) man page:

The command nix-shell will build the dependencies of the specified derivation, but not the derivation itself. It will then start an interactive shell in which all environment variables defined by the derivation path have been set to their corresponding values, and the script $stdenv/setup has been sourced. This is useful for reproducing the environment of a derivation for development.

Hence, nix-shell does not run the checkPhase (or any other phase) either.

So just use a shell.nix like this:

(import <nixpkgs> {}).python3Packages.the-package-we-care-about

or, to change source:

with import <nixpkgs> {};
python3Packages.the-package-we-care-about.overrideAttrs (old: {
  src = the-source-we-want;
})

Inside the nix-shell, the phases are available in the shell variables of the same name, e.g. $installCheckPhase.

For Python packages, $checkPhase is empty, because buildPython(Package|Application) sets the installCheckPhase to the checkPhase, if present, and discards the latter.
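
As a rough sketch of how the test command and source location could be printed automatically on entering the shell, a shellHook can be added with overrideAttrs. The attribute name the-package-we-care-about is the placeholder used above, and the echoed labels are made up for illustration. Note that $installCheckPhase may be empty for packages that run their tests through a hook rather than an explicit checkPhase, and that setting shellHook replaces any default shell hook the builder would otherwise install.

```nix
# shell.nix (sketch under the assumptions stated above)
with import <nixpkgs> {};

python3Packages.the-package-we-care-about.overrideAttrs (old: {
  # Runs once when nix-shell starts; the variables are expanded by bash,
  # not by Nix, because they are written without braces.
  shellHook = (old.shellHook or "") + ''
    echo "Test command (installCheckPhase):"
    echo "$installCheckPhase"
    echo "Source: $src"
  '';
})
```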

jonringer · April 30, 2020, 6:36pm

Correct.

The best way I have found to debug failing tests is simply to use the pytest test runner. It captures stdout and stderr, and when a test fails, it prints the captured output along with the traceback. Generally you’re able to see the stack trace of each failing test, which should give you a good indication of whether the tests are failing because of the sandbox environment, invalid dependencies, or some other error.

If the tests fail because of the sandbox environment (they try to access the network, HOME, etc.), it’s fine to disable them. However, sometimes such failures indicate a legitimate problem (for example, the tests may try to run a command at a hard-coded path, e.g. subprocess.check_call([ "/bin/bash", ... ])).
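
A minimal sketch of what working around sandbox-related failures might look like is given below. It assumes the package runs its tests via pytestCheckHook (disabledTests is only honoured in that case) and that its check phase runs the preCheck hook; the test name is hypothetical. In practice these attributes would go into the package definition itself rather than an override.

```nix
# Sketch: the-package-we-care-about and the test name are placeholders.
with import <nixpkgs> {};

python3Packages.the-package-we-care-about.overridePythonAttrs (old: {
  # Give the tests a writable HOME inside the sandbox.
  preCheck = (old.preCheck or "") + ''
    export HOME=$(mktemp -d)
  '';
  # Skip individual tests that cannot work in the sandbox.
  disabledTests = (old.disabledTests or [ ]) ++ [
    "test_that_needs_network" # hypothetical test name
  ];
})
```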