If you’re interested in reproducible experiment managers for ML, check out ck and Popper – ck is sort of a meta-package manager, designed for building reproducible ML workflows. Popper is more for reproducing single papers.
On Spack (disclaimer: I’m the lead developer):
+300K lines of Python2 code would certainly add risks.
Spack is ~56k SLOC of Python for the core and ~90k SLOC for the builtin package repository. There are also some vendored external packages (which we include so that users do not have to install dependencies – it works right out of the repo). All that code is from years of substantial contributions, so I am not sure I understand why it adds rather than mitigates risks. It’s not designed as a library, but we should probably pull a few things out of it as libraries given the work that’s gone into them.
I didn’t see any comments on isolation in Spack
Every build gets its own process with a cleaned environment, and builds have compiler wrappers injected into them to force includes, RPATHs, library search paths, etc. to point to dependencies. We clean out many user environment variables that affect builds (LD_LIBRARY_PATH, etc.).
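As a rough illustration of the environment-cleaning idea (this is not Spack's actual code; the variable list and the function names `cleaned_env` and `run_isolated` are made up for this sketch), a build step can be launched in its own process with a scrubbed copy of the caller's environment:

```python
import os
import subprocess

# Hypothetical, deliberately short list of variables that can leak into a build.
# Spack's real list is longer and is not reproduced here.
VARS_TO_DROP = ("LD_LIBRARY_PATH", "LIBRARY_PATH", "CPATH", "LD_PRELOAD")


def cleaned_env(base_env):
    """Return a copy of base_env without the variables in VARS_TO_DROP."""
    return {k: v for k, v in base_env.items() if k not in VARS_TO_DROP}


def run_isolated(cmd, base_env=None):
    """Run one build command in a separate process with a cleaned environment."""
    env = cleaned_env(dict(os.environ) if base_env is None else base_env)
    # check=True raises CalledProcessError if the command exits non-zero.
    return subprocess.run(cmd, env=env, check=True)
```

The caller's own environment is left untouched; only the child process sees the filtered copy. Compiler wrappers, which rewrite the include and link flags, are a separate mechanism and are not shown.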
We do not use a chroot environment. You can bring external packages into a build with the external packages mechanism. If you do this, it makes builds “impure” by Nix standards but it allows us to do things like use preinstalled, often proprietary packages on HPC machines (like the system MPI library, compilers, etc.).
I can’t find any comments about non-deterministic builds
I can’t track down exactly what is meant by “non-deterministic” builds in the Nix community, so I don’t know how much help this will be. But, we try to enable reproducible builds.
The concretizer (resolver) creates a DAG with package, version, compilers, compiler versions, build options, target architecture, etc., and that DAG is hashed recursively. We call this a “concrete” spec, i.e. one with all parameters filled in. Package recipes are templated by these parameters, and rebuilding packages with the same concrete DAG should be reproducible. So, given a spack.lock file for a spack environment (which contains the concrete DAG), you should be able to reproduce a build deterministically. It’s not bitwise reproducible in the Debian sense (AFAIK neither is Nix) but there should be no variation in the options given to the build or the commands run. We do provide a spack install --dirty flag in case users insist on preserving their environment settings in the build environment, but we discourage its use.
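To make “hashed recursively” concrete, here is a minimal sketch of the general technique. It is not Spack's actual hashing scheme: the dictionary layout, the choice of SHA-256 over JSON, and the names `dag_hash` and `example_specs` are all assumptions for illustration. Each node's hash covers its own parameters plus the hashes of its dependencies, so a change anywhere below a package changes that package's hash too.

```python
import hashlib
import json


def dag_hash(name, specs, _cache=None):
    """Hash a node's parameters together with the hashes of its dependencies."""
    if _cache is None:
        _cache = {}
    if name in _cache:
        return _cache[name]
    node = specs[name]
    payload = {
        "name": name,
        "version": node["version"],
        "compiler": node["compiler"],
        "variants": node.get("variants", {}),
        "target": node["target"],
        # Recurse: a dependency's hash becomes part of this node's identity.
        "deps": {d: dag_hash(d, specs, _cache) for d in sorted(node.get("deps", []))},
    }
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    _cache[name] = hashlib.sha256(encoded).hexdigest()
    return _cache[name]


def example_specs(zlib_version):
    """Build a tiny two-node DAG (hypothetical values) where hdf5 depends on zlib."""
    common = {"compiler": "gcc@9.2.0", "target": "x86_64"}
    return {
        "zlib": {"version": zlib_version, **common},
        "hdf5": {"version": "1.10.6", "variants": {"mpi": False}, "deps": ["zlib"], **common},
    }
```

With this sketch, `dag_hash("hdf5", example_specs("1.2.13"))` returns the same value every time it is called, while switching only the zlib version yields a different hash for hdf5 even though none of hdf5's own parameters changed.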
You can also “re-concretize” the abstract spack.yaml file from one platform on another, to get a functionally equivalent (but not identical) environment. We’d like to improve our solver to the point that it could produce a resolution that is “as close as possible” to one from another platform, modulo platform-specific constraints, but that is not done yet.
See the recent FOSDEM talks on Spack’s concretizer and our archspec library for packaging optimized binaries if you’re interested in more details on the motivation for Spack.