Questions on Nix as dev platform for HPCs

ancienttoilet · August 28, 2022, 2:34pm

Hello folks,

I am new to Nix and I am considering it as a viable option for development and software deployment at the company I work for. The company is involved in DNA research and makes its two HPC platforms available to university users free of charge.

Current situation:
I maintain two HPCs, one running Ubuntu 18.04 and the other CentOS 7. Users request scientific software to be installed on the platforms. The software is often maintained by amateur or non-professional developers. Due to dependency incompatibilities between CentOS 7 and Ubuntu 18.04, the software needs to be compiled and installed separately for each system.

What I need:
I would like to unify software deployment so that I only have to maintain one repository catalog instead of two, with software that is compatible with both systems.
I had sort of achieved this with a Gentoo Prefix, which I used to install software that required a higher GLIBC version than the ones provided by the two systems. However, maintaining the Prefix can be a bit of a pain, and ideally we wouldn’t want to change/update the dependencies of the applications installed.
TL;DR: the software will have to be linked against the Nix libraries instead of the system ones. And if we could make those dependencies embedded into the application and static/immutable, that would be even better.

Also, the users do not have access to the internet, and so the software needs to be pre-installed into the environment. Thus, I do believe nix-shell would not be an option for them. Please correct me if I am wrong.

Would nix be able to achieve all of that? And if so, how would the users be able to call/run applications installed through nix?

Regards

JosW · August 28, 2022, 4:33pm

I don’t have the answer and am by no means an HPC expert, but it is a very strange coincidence that you ask this now.
Just a week ago I stumbled onto the following European project: EESSI, the European Environment for Scientific Software Installations, https://www.eessi-hpc.com/

The European Environment for Scientific Software Installations (EESSI, pronounced as “easy“) is a brand new collaboration between different European HPC sites & industry partners, with the common goal to set up a shared repository of scientific software installations that can be used on a variety of systems, regardless of which flavor/version of Linux distribution or processor architecture is used, or whether it’s a full size HPC cluster, a cloud environment or a personal workstation.

A nice read can be found here, https://onlinelibrary.wiley.com/doi/10.1002/spe.3075
In this article they also mention Gentoo Prefix, used together with CERN’s CVMFS, which has also been mentioned here on Discourse: Distributing the Nix store with CVMFS+Nix? - #4 by siscia

At first I thought EESSI was trying to reinvent the wheel, because Nix would be able to achieve just what they are looking for.
And just as @toraritte in the above Discourse topic thought of Flox when reading about the CVMFS file system, I was thinking that Flox (@limeytexan @bpiv400) might fit in nicely with what the EESSI project is after.
Turns out Nix is even mentioned as a potential option in the article:

Other potential options for the compatibility layer included Nix [38] (and GNU Guix [39], which is based on the Nix package manager), which had previously been evaluated for this purpose by Compute Canada [40], though they also currently use Gentoo Prefix (see Section 8.1 for more on the collaboration between EESSI and Compute Canada).

But Gentoo Prefix was chosen for Compute Canada; an overview can be read here:
https://www.researchgate.net/publication/334778484_Providing_a_Unified_Software_Environment_for_Canada’s_National_Advanced_Computing_Centers

4.1

In general, Nix provides an environment that is internally consistent but is best developed using Nix itself, through tools such as nix-shell.
Such tools can only be used if the Nix store is writable, an impossibility with CVMFS. Attempting to use Nix in conjunction with other development tools using environment variables can be fragile at times. Moreover, the dependency of the hash on the whole installation recipe, instead of only the dependencies (as, for example, implemented in Spack [8]), causes the store path to change even if a bug fix or security update needs to be applied, aggravating this problem.
We must note however, that some clusters in Compute Canada provide Nix with a writable Nix store, which is completely separate from the software stack described here.

There’s also a talk about the Compute Canada endeavour on YT, https://www.youtube.com/watch?v=n6rqjf3dmI4

By providing the above info, I hope that follow-up answers to your question can also take it into account. I for one would be very interested in why their attempt at using Nix did not succeed, and whether the more experienced community members or the NixOS Foundation could help get it working for them, assuming they still want to try Nix.
Sorry for rambling…

knedlsepp · August 28, 2022, 6:43pm

I’d say it somewhat depends on whether the software you want to provide consists of “end-user” binaries, or of libraries and compilers that your users will use to develop software themselves.
If the latter is the case, I think there are quite a few pitfalls involved, and this might require not only you but also your users to learn about them.
In the first case, using Nix is certainly an excellent option.

tejing · August 28, 2022, 9:49pm

I don’t know anything specifically about HPCs, but your “what I need” section sounds like a description of exactly what nix can do.

The lack of internet access isn’t actually a problem, just a minor annoyance. You’ll need to make a derivation that pulls all relevant software into its closure, then make sure that derivation is registered as a gcroot on the system. Once that’s done, nix-shell will work without network access because what it needs will already be locally cached. Software can even be tweaked locally and compiled, all without network access, so long as you keep the source tarballs it’s built from in the nix store in a similar way.
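
A minimal sketch of such a "pull everything into one closure" derivation, assuming a hypothetical file name cluster-env.nix and using samtools and bwa purely as placeholder packages (substitute whatever your users actually need):

```nix
# cluster-env.nix (hypothetical file name)
# Collects a set of packages into a single environment, so that the
# closure of this one derivation contains everything users should have.
{ pkgs ? import <nixpkgs> { } }:

pkgs.buildEnv {
  name = "cluster-env";

  # Placeholder package list; this is an assumption for illustration.
  paths = [
    pkgs.samtools
    pkgs.bwa
  ];
}
```

Building it with something like `nix-build cluster-env.nix -o /some/path/cluster-env` leaves a result symlink that Nix registers as a garbage-collector root, so the closure should stay in the store until that symlink is removed. The output path here is only an example.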

markuskowa · August 29, 2022, 10:53am

We use Nix on a daily basis on our compute clusters to perform electronic structure calculations. All software packages are provided through Nix on our clusters. We have two clusters, one runs on NixOS directly and the other runs on CentOS. The applications are 1:1 transferable between both clusters (no rebuilds necessary).
From a user perspective, it is quite straightforward: interactive sessions use nix-shell directly, either via nix-shell -p <packages> or nix-shell /path/to/shell.nix. Batch jobs that run via the workload manager (Slurm in our case) use the shebang mechanism provided by nix-shell.
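
As a rough illustration only (not the actual setup described above), a shell.nix for the interactive case might look like the following. The package chosen is a placeholder assumption:

```nix
# shell.nix (illustrative sketch; the package list is a placeholder)
{ pkgs ? import <nixpkgs> { } }:

pkgs.mkShell {
  # Tools that should be on PATH inside the shell.
  packages = [
    pkgs.samtools
  ];
}
```

Running `nix-shell /path/to/shell.nix` would then drop the user into a shell with those tools available. For batch jobs, the shebang mechanism works by starting the script with `#! /usr/bin/env nix-shell` followed by a second line such as `#! nix-shell -i bash -p <packages>`.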

If you want to run a Nix environment without internet access, you may want to have your own binary cache.
This could be achieved, for example, by running a Hydra build server which defines job sets for all packages that should be available on the cluster.

ancienttoilet · August 29, 2022, 12:05pm

That sounds great, @tejing and @markuskowa. This gives me more confidence about the work that lies ahead.

This is an example of software I compile and make available to the users as standalone application loadable as an environment module. But there are many many more.

And so, ideally I would want to automate the deployment of an application using a .nix script, and put the instructions to build the application in that script.
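
A minimal sketch of what such a .nix script could look like, assuming a hypothetical tool called example-tool that ships as a local tarball, needs zlib, and builds with the usual ./configure, make, make install steps. All names and the version are made up for illustration:

```nix
# example-tool.nix (hypothetical; names, version, and source are placeholders)
{ pkgs ? import <nixpkgs> { } }:

pkgs.stdenv.mkDerivation {
  pname = "example-tool";
  version = "1.0";

  # A tarball kept next to this file, so no network access is needed to fetch it.
  src = ./example-tool-1.0.tar.gz;

  # Libraries the tool links against; these come from nixpkgs, not from the host OS.
  buildInputs = [ pkgs.zlib ];

  # No phases are overridden: the standard environment unpacks the source
  # and, by default, runs ./configure, make, and make install into $out.
}
```

Built with `nix-build example-tool.nix`, the resulting binaries would normally reference their libraries by absolute /nix/store paths rather than the system ones, which is what makes the same build usable on both Ubuntu and CentOS.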

Users won’t compile this kind of software themselves, so are nix-shell and a Nix binary cache still something I want to look into? The users won’t need, for instance, gcc or any GNU component that Nix provides. They just need to use the scientific software we make available for them.

Regards