Building a statically linked nix for hpc environments

You probably should swap the /home/eelco for $HOME

Okay, you got me. I didn’t try the exact command line. I adapted $HOME. :wink:

Hi there,

I am a Nix newbie and I recently joined the forum. I am having the same issues as @pimiddy trying to run Nix as an unprivileged user in an HPC environment without user namespaces. Is there any update on this?

I managed to get the static nix binary and run it as @edolstra wrote. However, I get several messages like the following when I try to build a derivation (I guess one message for each build dependency of the derivation):

asked 'https://cache.nixos.org' for '/home/emilio/.nix/store/14xlcm1m7sg1l4di7g9p2kk8pf2l6m4f-hello-2.10.tar.gz' but got '/nix/store/3x7dwzq014bblazs7kq20p9hyzz0qh8g-hello-2.10.tar.gz'

and then:

[0/96 built, 0/1 copied] Real-time signal 0

I am on an HPC cluster with an NFS-mounted filesystem. Any help would be really appreciated.

Thanks a lot.

Best,
Emilio

Have you had a look at nix-portable?


This error is caused by musl’s nonstandard use of signals. For a quick fix, you may remove all calls to setrlimit from nix.
Ref: Investigate why setrlimit setting in PAL on Alpine in breaks GDB · Issue #6767 · dotnet/runtime · GitHub

Thanks @j-k! I have just tried nix-portable but I get this error when I try to run anything, e.g.:

$ NP_DEBUG=2 ./nix-portable nix-shell -p hello
...
proot error: execve("/nix/store/p8d4qqiqcmx935m2b5a1gsmr6sp1ihsn-nix-2.4pre20201201_5a6ddb3/bin/nix-build"): No such file or directory
proot info: possible causes:
  * the program is a script but its interpreter (eg. /bin/sh) was not found;
  * the program is an ELF but its interpreter (eg. ld-linux.so) was not found;
  * the program is a foreign binary but qemu was not specified;
  * qemu does not work correctly (if specified);
  * the loader was not found or doesn't work.
fatal error: see `proot --help`.
proot error: trying to remove a directory outside of '/tmp', please report this error.

proot error: can't chmod '/tmp/proot-13030-ueN2q1': No such file or directory
+ echo 'Fatal error: nix is unable to build packages'
Fatal error: nix is unable to build packages
+ exit 1

Hi @NickCao, thanks for your reply. I understand this would mean recompiling nix from source, though I am not sure I really want to do that…

If you could open an issue on Issues · DavHau/nix-portable · GitHub with a full log that’d be great. I wouldn’t want to swamp this thread with debugging :sweat_smile:

I am in the same situation: I don’t have user namespaces available on our HPC and it is running on NFS. The process of compiling it on a few computers seems a bit daunting and I would like to avoid it if possible. I also thought maybe I could compile it to a static binary (with the store in my home folder baked in at compile time), but I couldn’t find out how to compile it into a static version.
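For anyone unsure whether their cluster is in the same situation, here is a minimal sketch of a check for unprivileged user namespaces. It assumes the `unshare` tool from util-linux is installed; the function name `userns_status` is made up for this example and is not part of Nix.

```shell
# Report whether an unprivileged user can create a user namespace.
# Assumption: `unshare` (util-linux) is on PATH; otherwise we cannot tell.
userns_status() {
  if ! command -v unshare >/dev/null 2>&1; then
    echo unknown
  elif unshare --user true >/dev/null 2>&1; then
    echo available
  else
    echo unavailable
  fi
}

echo "user namespaces: $(userns_status)"
```

If this reports `unavailable`, the approaches discussed in this thread (a static binary with a store under your home directory, or nix-portable) are probably the ones to look at.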

Did you solve the issue? I think it would be helpful to have a docker container with all dependencies included, so that it is easy to compile a custom binary. I am happy to try this.

Did you ever solve the issue you described in this thread and are there any instructions?

Thanks


I think you want Building a statically linked nix for hpc environments - #16 by edolstra


@knedlsepp Thanks, this looks promising and I tried it, but I just get an error:

warning: error: unable to download 'https://github.com/NixOS/flake-registry/raw/master/flake-registry.json': SSL peer certificate or SSH remote key was not OK (60); retrying in 310 ms

@hhoeflin: I think you need to set NIX_SSL_CERT_FILE. That said, I haven’t actually tried @edolstra’s way of doing it, which obviously is preferable. GitHub - danielbarter/hpc-nix worked on every cluster I tried it on, but the compile times are pretty painful (everything is rebuilt from scratch) and having the whole nix store in your home dir is inconvenient because of disk quotas.

I ran into two more issues which aren’t really avoidable:

  1. If you are doing large multinode computations, you can run into problems where lots of workers are attempting to load shared libraries from your home directory. This is problematic because on clusters, the home dirs are usually on NFS and I was seeing mmaps timing out.
  2. The compilers from nixpkgs aren’t as good as the cluster-specific ones.

In the end, I just use the cluster environment as is, which isn’t ideal. I wonder if there are any HPC admins out there who like nix? It would be so amazing to have it baked into the system.

@danielbarter Thanks for letting me know and helping! Will have a look.

To what exactly should I set NIX_SSL_CERT_FILE? The cluster one or something else?

yeah, the cluster one
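A rough sketch of what "the cluster one" could look like in practice: pick the first readable CA bundle from a list of candidate paths and export it. The paths below are common distribution defaults and are assumptions; your cluster may keep its bundle elsewhere. The helper name `first_readable` is made up for this example.

```shell
# Print the first readable file among the arguments; fail if there is none.
first_readable() {
  for f in "$@"; do
    if [ -r "$f" ]; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  return 1
}

# Candidate system CA bundles (assumed locations, adjust for your cluster).
if bundle=$(first_readable \
    /etc/ssl/certs/ca-certificates.crt \
    /etc/pki/tls/certs/ca-bundle.crt \
    /etc/ssl/ca-bundle.pem \
    /etc/ssl/cert.pem); then
  export NIX_SSL_CERT_FILE="$bundle"
  echo "NIX_SSL_CERT_FILE=$NIX_SSL_CERT_FILE"
else
  echo "no CA bundle found in the usual places; ask your cluster admins" >&2
fi
```

Run this in the shell (or put it in your shell profile) before invoking nix, so the download of the flake registry can verify the server certificate.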
