Idea/RFC: Allow processes to declare Nix GC roots

sliedes · December 7, 2024, 5:55am

Nix goes to great lengths to avoid garbage-collecting store paths that are in use. It walks the running processes and looks for store paths in their /proc/$pid/environ, fds, maps, lsof output, etc:

https://github.com/NixOS/nix/blob/ab5a9cf2db31d3840a801385349b9d23deb29ecc/src/libstore/gc.cc#L362

          auto procDir = AutoCloseDir{opendir("/proc")};
          if (procDir) {
              struct dirent * ent;
              auto digitsRegex = std::regex(R"(^\d+$)");
              auto mapRegex = std::regex(R"(^\s*\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+(/\S+)\s*$)");
              auto storePathRegex = std::regex(quoteRegexChars(storeDir) + R"(/[0-9a-z]+[0-9a-zA-Z\+\-\._\?=]*)");
              while (errno = 0, ent = readdir(procDir.get())) {
                  checkInterrupt();
                  if (std::regex_match(ent->d_name, digitsRegex)) {
                      try {
                          readProcLink(fmt("/proc/%s/exe" ,ent->d_name), unchecked);
                          readProcLink(fmt("/proc/%s/cwd", ent->d_name), unchecked);
          
                          auto fdStr = fmt("/proc/%s/fd", ent->d_name);
                          auto fdDir = AutoCloseDir(opendir(fdStr.c_str()));
                          if (!fdDir) {
                              if (errno == ENOENT || errno == EACCES)
                                  continue;
                              throw SysError("opening %1%", fdStr);
                          }
                          struct dirent * fd_ent;
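
To make the matching step concrete, here is a minimal standalone sketch of pulling store paths out of a blob of text, such as the contents of /proc/$pid/environ. It is not Nix's code: `findStorePaths` and `escapeRegex` are hypothetical names, `escapeRegex` is only a stand-in for the `quoteRegexChars` helper used in the excerpt, the character class is written without the redundant escapes, and the store path in the usage note is a made-up placeholder.

```cpp
#include <regex>
#include <set>
#include <string>

// Hypothetical stand-in for Nix's quoteRegexChars: backslash-escape
// every regex metacharacter so storeDir is matched literally.
std::string escapeRegex(const std::string & raw)
{
    static const std::regex special(R"([.^$\\*+?()\[\]{}|])");
    return std::regex_replace(raw, special, R"(\$&)");
}

// Collect every distinct store path mentioned in `text`.
// The pattern follows storePathRegex from the excerpt: the store
// directory, a slash, then a run of name characters that stops at
// the next '/', ':' or whitespace.
std::set<std::string> findStorePaths(
    const std::string & text,
    const std::string & storeDir = "/nix/store")
{
    const std::regex storePathRegex(
        escapeRegex(storeDir) + R"(/[0-9a-z]+[0-9a-zA-Z+._?=-]*)");
    std::set<std::string> found;
    for (std::sregex_iterator it(text.begin(), text.end(), storePathRegex), end;
         it != end; ++it)
        found.insert(it->str());
    return found;
}
```

Calling `findStorePaths("PATH=/nix/store/0123abcd-example-1.0/bin:/usr/bin")` yields the single entry `/nix/store/0123abcd-example-1.0`. A path that only exists inside a shell function body never reaches a scan like this, because that text is not in any of the places being read.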

This scanning is obviously not, and cannot be, entirely perfect. For example, one issue I’m running into is with nix-locate’s command-not-found hook: /etc/bashrc sources command-not-found.sh from nix-index-with-db, which defines a bash function command_not_found_handle that has a hardcoded path to /nix/store/s8pf5...-nix-index-with-db-0.1.8/bin/nix-locate.

None of this appears anywhere Nix looks for GC roots, so nix-index-with-db can get garbage collected. Setting a bash environment variable doesn’t help either, since /proc/$pid/environ contains only the initial environment of the bash process.
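
The point about the initial environment can be checked with a small Linux-only sketch: a variable set at run time is visible to the process itself but does not show up in /proc/self/environ. The function names and the `DEMO_GC_ROOT_HINT` variable are hypothetical, and this assumes a libc such as glibc whose `setenv` stores new variables outside the original environment block.

```cpp
#include <cstdlib>
#include <fstream>
#include <iterator>
#include <string>

// Read /proc/self/environ (Linux only). Returns an empty string if
// the file cannot be opened.
std::string readProcSelfEnviron()
{
    std::ifstream in("/proc/self/environ", std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Set a variable at run time, then report whether the kernel's view
// of this process's environment mentions it.
bool runtimeVarVisibleInProc(const std::string & name, const std::string & value)
{
    setenv(name.c_str(), value.c_str(), 1); // POSIX; 1 = overwrite
    return readProcSelfEnviron().find(name + "=") != std::string::npos;
}
```

On a typical Linux system, `runtimeVarVisibleInProc("DEMO_GC_ROOT_HINT", "/nix/store/0123abcd-example-1.0")` returns false even though `std::getenv("DEMO_GC_ROOT_HINT")` afterwards returns the value, so exporting a variable from a running shell gives a /proc-based scan nothing to find.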

Now, I consider this exact issue minor and probably not worth going to great lengths to avoid, but perhaps there are other similar issues that could be solved in a similar fashion?

So I’m wondering if it would be possible to define a way to declare that a certain process depends on a certain store path.

Obviously a process opening a file would be enough, but that’s perhaps a bit annoying; getting bash to open /etc/bashrc and keep it open would be sufficient, but feels like it might require modifying bash, and it would consume a file descriptor (or bash could map the file instead).

Instead I’m thinking maybe it would be possible to develop a mechanism to attach such information to a specific process from outside, so we could have something like

nix-register-gc-roots --pid $pid /nix/store/...

This could use something like (pid, start_time) as a unique identifier for that process so that reused pids won’t prevent GC.
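
As a rough sketch of how such a (pid, start_time) key could be obtained on Linux, the start time is field 22 of /proc/$pid/stat, measured in clock ticks since boot. Nothing here exists in Nix: `ProcessKey`, `parseStartTime` and `processKey` are hypothetical names, and how a registry would store or compare the keys is left open.

```cpp
#include <fstream>
#include <iterator>
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical identifier for "this exact process, not a later one
// that happens to reuse the pid".
struct ProcessKey {
    long pid;
    unsigned long long startTime; // clock ticks since boot
};

// Extract starttime (field 22) from the contents of /proc/<pid>/stat.
std::optional<unsigned long long> parseStartTime(const std::string & stat)
{
    // Field 2 (comm) is parenthesised and may itself contain spaces
    // or ')', so skip past the last ')' before splitting on whitespace.
    auto close = stat.rfind(')');
    if (close == std::string::npos) return std::nullopt;
    std::istringstream rest(stat.substr(close + 1));
    std::string field;
    // The first token after ')' is field 3 (state); keep reading
    // until `field` holds field 22.
    for (int i = 3; i <= 22; ++i)
        if (!(rest >> field)) return std::nullopt;
    try {
        return std::stoull(field);
    } catch (const std::exception &) {
        return std::nullopt;
    }
}

// Build the key for a live process, or nullopt if it is gone or unreadable.
std::optional<ProcessKey> processKey(long pid)
{
    std::ifstream in("/proc/" + std::to_string(pid) + "/stat");
    std::string stat((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    auto start = parseStartTime(stat);
    if (!start) return std::nullopt;
    return ProcessKey{pid, *start};
}
```

A collector could then treat a registration as stale whenever `processKey(pid)` is empty or its `startTime` differs from the recorded one, which is what keeps a recycled pid from pinning a store path.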

Would something like this be desirable? Overly complex for the benefits?

mightyiam · December 7, 2024, 1:53pm

Can’t that program be implemented in a way that does end up with a gcroot?

waffle8946 · December 7, 2024, 7:54pm

Seems to already be addressed in the OP. And programs caring about whether nix is on the system would be even stranger.