Comparing Nix and Conda

I concur. Having something like /opt/nix is much, much more palatable than /nix for traditional IT, but this has been brought up before even for non-HPC cases. For instance, I have a centrally managed Mac, so I can’t use the Catalina workarounds to install Nix because I can’t create volumes.

I don’t have any experience with Singularity. It still requires a good relationship with the administrator, doesn’t it? [“Singularity must be installed as root to function properly.” though these aren’t the most recent docs, perhaps.] Or have you found that it’s easier to persuade administrators to install Singularity than to install Nix?

Do you have anything noteworthy to report on your experience with Singularity?

More so that Singularity is very commonly installed on clusters, as it doesn’t require a Docker daemon running as root (which apparently is a security concern). I don’t particularly like Singularity apart from its availability.

They typically use a module system like Lmod, which allows you to switch between different software environments. Many commonly used scientific packages are provided through the module system, as well as newer Python versions, etc.

Rust is easy ;), cluster nodes typically have a shared home directory. So, you can just install Rust with rustup as is common on non-NixOS systems (and you only need one system/node anyway, since it is often easy to deploy Rust binaries). At any rate, I don’t think Rust has much uptake in HPC/science yet.

The module system is very clunky, but at least they offer MKL, CUDA, Intel compilers, etc., and libraries compiled against those, out of the box.

I left academia 6 months ago. But when we would run something on an HPC cluster or Grid, we would just build the software on an old CentOS version (whatever version is supported). On larger clusters, like the European E-Science Grid, you can write a job specification where you can specify a CentOS/Scientific Linux version, and query how many CPUs are available with that OS version. Luckily, the last group I worked in had the funding to buy large machines for ourselves, so I was relieved from old CentOS versions and could install Nix.

At any rate, HPC clusters are mostly Linux like it was 5-10 years ago, and most people just accept that. For some reason, those maintaining clusters are very conservative. I guess at least stuff doesn’t break very often.

I agree though that Nix would be a much better choice, if we improved our scientific software story.

I don’t think this will help much in HPC, where /opt typically contains software shared between machines through e.g. NFS or AFS. AFAIK multi-user Nix does not work in that scenario. Maybe not everywhere, but /opt was always on some networked storage where I have worked.

The holy grail is allowing arbitrary install paths, since then people could just install Nix into their home directories as long as cluster admins do not do cluster-wide installs. As far as I know, CentOS 8 (in contrast to CentOS 7) enables user namespaces by default, so there is hope!

Concerning Singularity and Nix:

Apparently this is possible now? No clue how badly it will mess up nixpkgs.

I think the point is to have that while still using the binary cache? Otherwise it looks like people generally succeed whenever they invest a week or two into figuring out the details, even without the now-available static Nix.

Of course, if non-privileged access to user/mount namespaces is available, there is no problem with this (and apparently the availability is still getting wider).
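A rough way to check whether a given node allows this is to look at the kernel knobs under /proc/sys. The sketch below is a heuristic under stated assumptions, not a definitive test: the helper name `unprivileged_userns_available` is made up for illustration, the second file is a Debian/Ubuntu-specific switch that is absent on other distributions, and other mechanisms (seccomp, AppArmor, site policy) can still block namespaces even when both knobs look fine.

```python
from pathlib import Path


def unprivileged_userns_available(proc_sys: Path = Path("/proc/sys")) -> bool:
    """Best-effort guess at whether unprivileged user namespaces are enabled.

    `proc_sys` is a parameter only so the logic can be exercised against a
    fake directory tree; on a real node the default is what you want.
    """
    max_ns = proc_sys / "user" / "max_user_namespaces"
    try:
        if int(max_ns.read_text().strip()) == 0:
            # A limit of 0 disables user namespaces for everyone.
            return False
    except (OSError, ValueError):
        # Knob missing or unreadable: be conservative and assume "no".
        # (Very old kernels may lack the file, so this can be a false negative.)
        return False

    # Debian/Ubuntu carry an additional switch; it does not exist elsewhere.
    clone = proc_sys / "kernel" / "unprivileged_userns_clone"
    try:
        return clone.read_text().strip() != "0"
    except OSError:
        return True


if __name__ == "__main__":
    print("unprivileged user namespaces look enabled:",
          unprivileged_userns_available())
```

Running this on a login node and on a compute node separately is worthwhile, since the two are not always configured identically.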

I think that saving them that week or two by providing a switch, is very worthwhile.

That does seem to be a more expensive problem, so a solution to this one is even more important. IIUC, the default binary cache is tied to a fully-qualified location of the nix store.

  • Will this be true for any binary cache? Or can a cache be configured with a relative path?
  • If full-qualification is unavoidable, is there some fully-qualified location that might be usable by at least a large proportion, if not all, of people in this situation?

The point being to provide a binary cache for a non-standard nix store location, that can be used by the greatest number of users on HPCs-without-admin-rights.
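On why the cache is tied to the store location: as far as I understand Nix's store path scheme, the store directory is itself one of the strings hashed into every store path, so the same package under a different store directory gets a different hash and therefore a different cache key. The sketch below is a simplified illustration of that idea, not Nix's algorithm bit for bit (real Nix folds the digest down to 160 bits and prints it in its own base-32 alphabet). The function name `sketch_store_path` and the example values are hypothetical.

```python
import hashlib


def sketch_store_path(store_dir: str, name: str, inner_digest_hex: str,
                      kind: str = "source") -> str:
    """Simplified stand-in for how a store path is derived.

    The important part is that `store_dir` is inside the hashed string.
    """
    fingerprint = f"{kind}:sha256:{inner_digest_hex}:{store_dir}:{name}"
    digest = hashlib.sha256(fingerprint.encode()).hexdigest()
    # Assumption for illustration: truncate the hex digest instead of
    # reproducing Nix's 160-bit folding and base-32 encoding.
    return f"{store_dir}/{digest[:32]}-{name}"


if __name__ == "__main__":
    # Same (made-up) package content, two different store locations.
    content = hashlib.sha256(b"hello-2.12 sources").hexdigest()
    for store in ("/nix/store", "/opt/nix/store"):
        print(sketch_store_path(store, "hello-2.12", content))
```

The two printed paths differ in their hash part, not only in their prefix. That is why a cache for a non-standard location has to be built for exactly that location, and why agreeing on one alternative location would let many users share a single cache.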

Hmm, and how about a proxy that receives requests for a non-standard location, and forwards them to the standard cache (or maybe even an arbitrary cache) with a translation of the path?

The problem is that you cannot just overwrite paths in binaries. If the new path is longer than the embedded path, you will overwrite other data. Also, paths may be stored with their lengths (rather than being NUL-terminated).
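A toy illustration of the overwrite problem, using a made-up byte blob rather than a real ELF file. The helper `overwrite_at` and all paths are hypothetical. It writes the new path at the offset of the old one without changing the file size, which is what a naive in-place patch has to do.

```python
def overwrite_at(blob: bytes, old: bytes, new: bytes) -> bytes:
    """Write `new` where `old` starts, keeping the blob the same size."""
    start = blob.index(old)
    end = start + len(new)
    if end > len(blob):
        raise ValueError("replacement would run past the end of the blob")
    # Everything between `start` and `end` is clobbered, whatever it was.
    return blob[:start] + new + blob[end:]


OLD = b"/nix/store/abc-hello/bin/hello"
# Fake "binary": a header, a NUL-terminated path, then unrelated data.
BLOB = b"HEAD\x00" + OLD + b"\x00" + b"VERSION=1\x00"

# Same-length prefix: the neighbouring data survives.
same = overwrite_at(BLOB, OLD, b"/opt/store/abc-hello/bin/hello")

# Longer prefix: the terminator and part of the next string are destroyed.
longer = overwrite_at(BLOB, OLD, b"/home/u/nix/store/abc-hello/bin/hello")

if __name__ == "__main__":
    print(same)
    print(longer)
```

With the longer prefix the path loses its terminating NUL and runs straight into what is left of the next string. A length-prefixed string would break even with a same-length replacement if the rewriter guessed the boundaries wrong.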

One could create a shim for glibc that rewrites paths for all path-based functions (open, etc.), but then you run into the problem that some programs use syscalls directly (e.g. Go programs).

I agree with @7c6f434c that namespaces are the ultimate solution. Move the store to a more acceptable location such as /opt/nix or somewhere in /var (to make the path more acceptable for global installs) and then use namespaces for unprivileged installs. Probably something like pam_namespace can be used to set up a namespace when the user logs in.

I think that saving them that week or two by providing a switch, is very worthwhile.

Well, it is sometimes hard to tell how much of that time goes to learning Nix in general. The switch will also need to be documented, and it will not remove the need to learn Nix.

But apparently now it is easier to get started, which is indeed nice.

  • Will this be true for any binary cache? Or can a cache be configured with a relative path?

Well, people use absolute paths defined at compile time in programs. Sometimes things get compressed, so it is not a textual substitution. For various reasons various people hope to estimate just how bad it is, and probably it is not always a huge problem, but there is risk. I guess an opportunistically-rewriting proxy and a wiki of things that break could be set up…
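The compression point can be shown with a small experiment. This is a sketch with zlib standing in for whatever compression a given file or archive actually uses; the path and the helper `rewrite_compressed` are made up for illustration.

```python
import zlib

STORE_PATH = b"/nix/store/abc-hello/share/doc/README"

# Some payload that mentions the store path many times, then compressed.
payload = b"\n".join([STORE_PATH] * 20)
compressed = zlib.compress(payload)


def rewrite_compressed(data: bytes, old: bytes, new: bytes) -> bytes:
    """What a rewriting proxy would have to do: unpack, substitute, repack."""
    return zlib.compress(zlib.decompress(data).replace(old, new))


if __name__ == "__main__":
    # Searching the compressed bytes for the prefix is expected to find
    # nothing, because the text is no longer stored as plain bytes.
    print("prefix visible in payload:   ", b"/nix/store" in payload)
    print("prefix visible in compressed:", b"/nix/store" in compressed)
    rewritten = rewrite_compressed(compressed, b"/nix/store", b"/opt/nix/store")
    print("rewritten payload starts with:", zlib.decompress(rewritten)[:24])
```

So an opportunistic proxy would need to recognise and unpack every compression layer it meets, and anything it fails to recognise would silently keep the old path.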

I was not able to get nix-user-chroot working on a cluster due to namespace issues and old kernel version.

On the other hand, I have Nix running on a CentOS 6 cluster with /nix mounted using PRoot. It didn’t require root access or containers. IIRC the only thing I needed to change was

use-sqlite-wal = false

in ~/.config/nix/nix.conf to avoid corrupting the Nix database on NFS.

Just a small data point, but one of my colleagues spent about 2 weeks trying to get scikits-odes to install, as it requires LAPACK and SUNDIALS, and SUNDIALS has to be built with LAPACK. I wrote a Nix derivation, saving everyone else two weeks of their lives.

https://scikits-odes.readthedocs.io/en/stable/installation.html#id1

https://scikits-odes.readthedocs.io/en/stable/installation.html#troubleshooting

https://scikits-odes.readthedocs.io/en/stable/installation.html#using-nix

Based on my overlap in being a physicist and using Nix: you’re really, really underselling reproducibility. The ability to reproduce, say, a graph from your inputs in a past paper is precious, as in, one step away from reputations being at stake.

I don’t want to go too much off-topic, but apparently this problem is largely solved by mamba, which uses a much faster SAT solver.

(I don’t have any experience with Mamba, since I do not really use the Python ecosystem outside Jupyter + PyTorch notebooks for validation of my Rust code.)

What’s next, viper? Or rattle… Another day, another package manager is invented.

The package king is dead, long live the packaging king!

Can someone admit that some language ecosystems and package managers are damaged beyond repair…

I can hear prebuilt Docker containers laughing from the shadows, ready to replace everything to the detriment of everyone.

I’m pretty confident that in the long run Nix is a way better fit for sci comp needs than conda. Nix Flakes are a real game changer. However, I keep running into some pain points and I don’t see Nix finding wide adoption until those are resolved (but maybe it’s just me getting things wrong).

I’ll use this thread as an excuse to mention some:

  • Integration with non-nixos ecosystems
    • Dynamic linking: nix run nixpkgs#poetry run python, pip install ..., ipykernel --user --install (jupyterhub with user kernels) all ultimately fail the moment you import a .so because the interpreter won’t have the absolute RPATH set. This may well be the biggest pain point I have to this day.
    • GL apps. There’s nixGLNvidia but last time I checked it was still incompatible with Nix Flakes. Even if it was compatible - I couldn’t convince people to accept this kind of UX, and I tried.
    • Nix is invasive.
      • The default instruction to install nix is to curl ... | sh which new people treat as fishy.
      • Nix wants /nix (although I see above linked posts on statically built nix and custom store locations)
      • dockerTools.buildImage is great, but it needs Nix on the host, as opposed to a Dockerfile that one can just reference in docker-compose.yml in a repo that non-Nix people can build
      • I tried using Nix to provide bazel and toolchain and to me it was great, but for the reasons above - instructing other (non-nix) people on how to get it running on their system has been quite frustrating
        • And I couldn’t “just give them the .so binary” because I need first to undo the RPATHs
  • Documentation, observability
    • Nix is dynamically typed and uses rather complex mechanisms like overlays and callPackage that give some “ad hoc” vibes. Each time I run into a new function, I have to start a lookup in all-packages.nix to figure out which file the name corresponds to, and even then I sometimes fail to infer the function’s signature. I keep doing this (until I find a better solution), but I can’t expect other people to commit to this
  • Cache misses. At times a huge build would trigger and it may be hard to figure out how to adjust the inputs so that a cached derivation is used instead
  • CUDA in Flakes means import nixpkgs { config.allowUnfree = true; ...} and it’s ad hoc again
  • UPD: I just tried fetching opencv-python with poetry2nix (which has been mentioned above) and 1) it didn’t try to fetch a prebuilt wheel, running a full build instead, 2) that build failed; the virtualenv way works with the same lock file. All that is to say: the UX isn’t smooth enough just yet, and a lot of work needs to be done before we can expect scicomp people to switch to Nix widely

Most of these sure can be worked around, and I wish I was more constructive/involved in improving things, but for now there’s just that…

Mach-nix from @DavHau has been working very well for me.

Mach-nix is superb! It would be even greater if it supported PEP 517/PEP 518 (but then again, I haven’t done anything to help it support them).