Don't expose Source Code in Binary Cache

Hey,

I’m currently preparing an internal presentation at my company to promote the usage of nix. However, I’m having a hard time getting my head around a particular problem:

At our company we take care of thousands of SBC devices (from here on referred to as “clients”) out in the wild, each running some kind of Linux distribution. Ideally I would want to replace the package management on those devices with nix (and later down the line potentially even the complete OS with NixOS). However, there are some concerns regarding the usage of nix in conjunction with closed source software. The following is an overview of those concerns (these are not necessarily my concerns, as I’m also a huge fan of open source software, and they are ordered from most paranoid to least paranoid :wink: ).

  1. Keeping the building process secret
    We cannot rely on pushing or copying binary artifacts to the clients via nixops, nix copy --to or something similar, as most of them are behind a NAT. In addition, users should be able to install packages on the clients locally (or, later down the road, edit the NixOS configuration locally). For me this means that the nix expressions that define the packages must be exposed to the clients, so anyone who has access to them (on a system level) might get full insight into everything that is relevant to building our software.
    The only solution I have found so far is to use nix copy --from http://mybinarycache /nix/store/3jwkkdj6wkjrascn088lzc50617q585ag-super-secret-software-1-0-0. This does seem to work, but it’s not as convenient as a simple nix-env --install, though that might be nitpicking. I only just stumbled upon the possibility to “install” the software this way, so I could have deleted this paragraph, but I also wanted to get your opinion on whether this is the right way to do it or whether there are better ways. (PS: Is there some way, like with nix-env --install, to get the store path into the profile? nix-env --install only seems to accept derivations.)

  2. Don’t expose Source Code in Binary Cache
    This is actually the most important point. Right now I have a NixOS build machine that builds my derivations. I’m not going to host the /nix/store of this build machine with nix-serve or something similar, as this would potentially expose the source files used as inputs for the derivations. For demo purposes I currently nix copy only the artifacts I want to deploy to a separate binary cache on the same machine (btw. can I simply host this binary cache with nginx, for example? If I understand it correctly this should “just work”?). However, one thing troubles me: what if a “source” store path somehow finds its way into the binary cache? E.g. if I screw up my derivation at some point, something along the lines of:

{ pkgs ? import <nixpkgs> { } }:

pkgs.stdenv.mkDerivation {
  pname = "hello";
  version = "0.1.0";

  src = builtins.fetchTarball {
    url = "https://ftp.gnu.org/gnu/hello/hello-2.12.tar.gz";
    sha256 = "sha256:1mc1vrixpkzkdnvpzn3b01awvha6z7k2dnpai3c6g89in8l1wr70";
  };

  postInstall = ''
    echo "$src" > $out/foo
  '';

}

Reading through the nix manual I found disallowedRequisites, which does seem to solve the problem. I guess I could overlay mkDerivation to default this parameter to src or something similar, but I’d rather tackle the problem at a later stage in the deployment pipeline (e.g. when running nix copy to the binary cache, or even by letting nginx decide whether it is allowed to serve an archive).
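To illustrate the disallowedRequisites idea, here is a minimal sketch based on the derivation above. It assumes that binding the fetched source to a local `src` variable and listing it in disallowedRequisites is sufficient for this case; the `let` binding is an addition for illustration, not part of the original expression. With this in place the build should be rejected, because $out/foo still contains the store path of the source.

```nix
{ pkgs ? import <nixpkgs> { } }:

let
  # Same source as in the derivation above, bound to a name so it can be
  # referenced in disallowedRequisites as well.
  src = builtins.fetchTarball {
    url = "https://ftp.gnu.org/gnu/hello/hello-2.12.tar.gz";
    sha256 = "sha256:1mc1vrixpkzkdnvpzn3b01awvha6z7k2dnpai3c6g89in8l1wr70";
  };
in
pkgs.stdenv.mkDerivation {
  pname = "hello";
  version = "0.1.0";

  inherit src;

  # The output must not depend on the source tree, directly or transitively.
  disallowedRequisites = [ src ];

  # The same mistake as above: this writes the source store path into the
  # output, which makes the source a runtime reference of the output.
  postInstall = ''
    echo "$src" > $out/foo
  '';
}
```

This check runs at build time, so it complements rather than replaces any filtering done later when copying to the binary cache.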

The binary cache would be at least protected with an access token / basic auth and the source tarballs / git repos are inaccessible from the internet anyway.
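On the hosting question: a cache written with nix copy --to file:///some/directory is a plain directory of .narinfo and NAR files, so a static web server can serve it. The following is a minimal sketch of a NixOS module for the build machine, not a vetted setup. The host name, the cache directory and the htpasswd file location are all hypothetical and need to be adapted.

```nix
{ ... }:
{
  services.nginx = {
    enable = true;

    # Hypothetical host name for the binary cache.
    virtualHosts."cache.example.internal" = {
      # Hypothetical directory previously filled with
      #   nix copy --to file:///var/lib/binary-cache <store path>
      root = "/var/lib/binary-cache";

      # Hypothetical htpasswd file; requests without valid credentials
      # are rejected by nginx.
      basicAuthFile = "/etc/nginx/binary-cache.htpasswd";
    };
  };
}
```

Clients would then need the credentials as well, for example through the netrc-file setting of nix, and basic auth should only be used over TLS.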

I’m not 100% sure where I’m going with this post; I just don’t have a good feeling about the solutions I came up with from scraping the manuals. I guess I just need some input or discussion from you. My search-engine-fu doesn’t seem to be good enough to give me a satisfying answer, and I’m almost certain that someone somewhere has faced a similar problem.

If you have any opinion, suggestion or, even better, experience with such a deployment, please let me know. I’m open to any critique, suggestion or what have you :wink:

I have never had this kind of problem, but I would try using two nix (expression) repositories.
One, private, to build the “binaries” (or a tar.gz), with a nix cache used only for it.

The other, “public” (the SBCs can access it), to download them, not from the nix cache but from your HTTP server. It could have a cache of its own, but I’m not sure there is any advantage.

private:

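{ pkgs }:

pkgs.stdenv.mkDerivation {
  pname = "hello-src";
  version = "0.1.0";
  src = builtins.fetchTarball {
    url = "https://ftp.gnu.org/gnu/hello/hello-2.12.tar.gz";
    sha256 = "sha256:1mc1vrixpkzkdnvpzn3b01awvha6z7k2dnpai3c6g89in8l1wr70";
  };
  # After installation, pack the built binary into a tarball in $out.
  # A wrapper directory is used because fetchTarball works best with
  # tarballs that contain a single top-level directory.
  postInstall = ''
    mkdir -p hello-bin/bin
    cp $out/bin/hello hello-bin/bin/
    tar -czf $out/hello-bin.tar.gz hello-bin
  '';
}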

Then configure the post-build hook of your build server so that it pushes the tarball to your.ftp.server.
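As a rough sketch of such a hook on a NixOS build server: the module below sets the post-build-hook option of nix to a small script. It assumes, purely for illustration, that the server accepts HTTP PUT uploads at https://your.ftp.server/ and that the tarball is named hello-bin.tar.gz as in the private expression; the script name is hypothetical too.

```nix
{ pkgs, ... }:
{
  nix.settings.post-build-hook = pkgs.writeShellScript "upload-binaries" ''
    set -eu
    set -f # disable globbing while splitting OUT_PATHS
    export IFS=' '

    # Nix sets OUT_PATHS to the space-separated output paths of the
    # build that just finished.
    for path in $OUT_PATHS; do
      # Only upload outputs that actually contain the binary tarball.
      if [ -f "$path/hello-bin.tar.gz" ]; then
        # Assumption: the server accepts uploads via HTTP PUT.
        ${pkgs.curl}/bin/curl --fail --upload-file \
          "$path/hello-bin.tar.gz" "https://your.ftp.server/"
      fi
    done
  '';
}
```

The hook runs after every build and blocks the build loop while it runs, so a slow or failing upload would hold up further builds.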

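public:

{ pkgs }:

pkgs.stdenv.mkDerivation {
  pname = "hello-bin";
  version = "0.1.0";
  src = builtins.fetchTarball {
    url = "https://your.ftp.server/hello-bin.tar.gz";
    # Placeholder: replace with the real hash of the uploaded tarball.
    sha256 = "sha256:ASDFASDFASDFASDFASDFASDFASDFASDFASDFASDFASDF";
  };
  # The tarball is prebuilt, so only copy the binary into place.
  dontBuild = true;
  installPhase = ''
    mkdir -p $out/bin
    cp bin/hello $out/bin/
  '';
}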

Sadly, that creates two problems:
1 - the nixpkgs used for building and the nixpkgs used for deploying have to match (or you have to deal with makeWrapper and patchelf)
2 - you need some process to update the hashes and dependencies (code duplication)
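For the first problem, one possible approach is to pin nixpkgs in a small file that both repositories share. The sketch below is only an illustration: the file name nixpkgs.nix is hypothetical, and both the commit in the URL and the hash are placeholders that have to be filled in.

```nix
# nixpkgs.nix (hypothetical file name), kept identical in both repositories
import (builtins.fetchTarball {
  # Placeholder: insert the commit hash of the nixpkgs revision to build with.
  url = "https://github.com/NixOS/nixpkgs/archive/<commit>.tar.gz";
  # Placeholder hash: replace with the real hash of that tarball.
  sha256 = "0000000000000000000000000000000000000000000000000000";
}) { }
```

Both the private and the public expression could then be called with `pkgs = import ./nixpkgs.nix`, so that the binary is built and installed against the same package set.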

Hmm, interesting idea. To be honest, I don’t really know how I feel about it :smiley: On the one hand this definitely reduces the risk of exposing the source code a lot, but on the other hand it would also increase the complexity of the whole deployment. It could even very easily lead to a catastrophic situation (as you mentioned) when the nixpkgs revisions mismatch.

I have to think about it. Thank you very much for your valuable input. And btw., I hadn’t come across the post-build hook for the builders before; it will come in very handy in either case. Thanks! :slight_smile:

This other solution might help with 1; I’m not sure about 2.
https://github.com/timbertson/runix

One of the reasons nix works so well is it treats software distribution as a caching layer for software building. A lot of the traditional problems from software distribution disappear when you make this assumption.

It isn’t surprising that things get a little awkward when you try and reintroduce a distinction between distribution and building.
