Overriding torch with torch-bin for all packages

Hello all,

I’m new to Nix, so I am likely doing something wrong, but allow me to articulate what the issue is.

I am using rules_nixpkgs to install a number of Torch-related packages and make them available to my Bazel build environment. Specifically, I would like all of them to depend on torch-bin rather than torch, since torch-bin actually has CUDA support.

My initial attempt was this:

nixpkgs_python_configure(
    repository = "@nixpkgs",
    python3_attribute_path = """
        python310.withPackages(ps: with ps; [
          torch-bin
          (detectron2.override { torch = python310.pkgs.torch-bin; })
          (optuna.override { torch = python310.pkgs.torch-bin; })
          (pytorch-lightning.override { torch = python310.pkgs.torch-bin; })
          (torchmetrics.override { torch = python310.pkgs.torch-bin; })
          (torchvision.override { torch = python310.pkgs.torch-bin; })
          (torchaudio.override { torch = python310.pkgs.torch-bin; })
        ])
    """,
)

However, this resulted in conflicts.

I then tried something a little more sophisticated (I am unsure whether it is correct):

nixpkgs_python_configure(
    repository = "@nixpkgs",
    python3_attribute_path = """
        python310.withPackages(ps: with ps; [
          (python310.pkgs.override {
            overrides = self: super: {
              torch = python310.pkgs.torch-bin;
            };
          })
          detectron2
          optuna
          torchvision
          torchmetrics
          torchaudio
          pytorch-lightning
        ])
    """,
)

This completes without error; however, it seems that the regular torch package is still being used rather than torch-bin. I know this because torch cannot see my GPUs:

In [4]: torch.cuda.is_available()
Out[4]: False

But, if I add torch-bin to the list of things to install, I get these errors:

building '/nix/store/iysal0lz3hs1xyi45niwsq94i6iaw8aa-python3-3.10.13-env.drv'...
error: collision between `/nix/store/5iprrznhmw6ma6fgkgvjr37cmvinmvr7-python3.10-torch-2.0.1/lib/python3.10/site-packages/torchgen/dest/__pycache__/register_dispatch_key.cpython-310.pyc' and `/nix/store/55ivn06v4xr0hkvdxif743x23fwcz7lc-python3.10-torch-2.0.1/lib/python3.10/site-packages/torchgen/dest/__pycache__/register_dispatch_key.cpython-310.pyc'
error: builder for '/nix/store/iysal0lz3hs1xyi45niwsq94i6iaw8aa-python3-3.10.13-env.drv' failed with exit code 255

However, removing all other packages and just installing torch-bin succeeds, and torch sees my GPUs as available:

In [1]: import torch

In [2]: torch.cuda.is_available()
Out[2]: True

Any ideas? I’m using Ubuntu 22.04 in WSL, with Nix 2.18.1 and the 23.11 package set.

Hi! Just a few quick comments:

  • You can use torch with CUDA support enabled if you instantiate your nixpkgs as

    import nixpkgs { config.cudaSupport = true; /* allowUnfree, cudaCapabilities, etc */ }
    

    Generally speaking, this is more reliable than torch-bin.

  • If you want to substitute torch with torch-bin globally, you can try

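    import nixpkgs {
      overlays = [(final: prev: {
        # Append to the existing list so extensions from other overlays are kept
        pythonPackagesExtensions = prev.pythonPackagesExtensions ++ [(py-final: py-prev: {
          torch = py-final.torch-bin;
        })];
      })];
    }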
    
  • To diagnose the torch.cuda.is_available() issue, try grepping the output of LD_DEBUG=libs python -c "import torch ; torch.cuda.is_available()" for libcuda, or the output of strace python -c ... for /dev/nvidia (feel free to follow up in this thread with what you observe).
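
For the global substitution, here is a minimal sketch of how the overlay might be combined with withPackages. It assumes that <nixpkgs> resolves to the 23.11 package set and that allowUnfree is needed because torch-bin is marked unfree. The file name is hypothetical, and the package list is simply copied from the question. Whether each of those packages actually builds against torch-bin is not something confirmed here.

```nix
# Hypothetical file: python-env.nix
let
  # Assumption: <nixpkgs> resolves to the 23.11 package set.
  pkgs = import <nixpkgs> {
    # Assumption: torch-bin is marked unfree, so evaluation needs this.
    config.allowUnfree = true;
    overlays = [
      (final: prev: {
        # Append so that extensions registered by other overlays are kept.
        pythonPackagesExtensions = prev.pythonPackagesExtensions ++ [
          (py-final: py-prev: {
            # Every package that takes `torch` as an argument now gets torch-bin.
            torch = py-final.torch-bin;
          })
        ];
      })
    ];
  };
in
# `torch` already points at torch-bin here, so torch-bin is not listed again.
pkgs.python310.withPackages (ps: with ps; [
  torch
  torchvision
  torchaudio
  torchmetrics
  pytorch-lightning
  optuna
  detectron2
])
```

Because the substitution happens inside the package set, only one torch derivation should end up in the environment, which is what avoids the file collision reported above. With rules_nixpkgs, the overlay would presumably need to live in the Nix expression that the @nixpkgs repository is instantiated from.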

Cheers


Thanks. I’ll check on the latter stuff when I’m back in the office tomorrow.

With regard to using Torch with CUDA support enabled, does this require us to build Torch ourselves? That was my main reason for wanting to use torch-bin.

Hi. There’s currently a binary cache used for development and testing, hosted on Cachix. See the thread "On nixpkgs and the "AI" (follow-up to 2023 Nix Developer Dialogues)" for more context.
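
For reference, the config.cudaSupport route mentioned earlier in the thread might look roughly like the sketch below. It assumes <nixpkgs> resolves to the 23.11 package set. The cudaCapabilities value is a hypothetical example and is left commented out. Without a binary cache that carries CUDA-enabled builds, expect this to compile torch locally.

```nix
# Sketch only; not a tested configuration.
let
  # Assumption: <nixpkgs> resolves to the 23.11 package set.
  pkgs = import <nixpkgs> {
    config = {
      allowUnfree = true;   # the CUDA libraries are unfree
      cudaSupport = true;   # build torch and its dependents against CUDA
      # cudaCapabilities = [ "8.6" ];  # hypothetical value; match your GPU
    };
  };
in
pkgs.python310.withPackages (ps: with ps; [
  torch
  torchvision
])
```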