How to "deploy" projects organized with nix for interpreted languages, e.g. R

telegott · April 27, 2024, 12:58pm

I’m very new to nix, so please forgive me if I haven’t grasped all the concepts.
I have a project in an interpreted language, e.g. R. So far I have come up with this flake.nix.
I have three modes, test, development, production, with shared and distinct packages.

Now I can do e.g. nix develop .#test and I’m dropped into a shell that has only the correct packages and the APP_ENV environment variable set.
How do I go on from here? Obviously I don’t only want a development shell; I also want to provide some entry points, e.g. for test a way to run the tests, and for production a way to start an arbitrary command that serves as the entry point.

Ultimately I want to build a docker image from the production set of packages. So far I have a rather peculiar non-nix setup that builds two docker images: one base image with all libraries and one that pulls in the base image and slaps the code on top. This way I can greatly reduce CI build times because in most cases I can just pull in the unchanged base image.

I would also be fine with a setup of regular Dockerfiles that internally use nix, orchestrated by shell scripts for now that install only the layers that are necessary, and then have a nix ... entrypoint.

So these are a few questions at the same time, but basically, how do I continue from a development shell to entrypoints for other things like running scripts, and how do I make sure that these scripts don’t result in any build work at runtime?

Thanks a lot!

{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
  inputs.flake-utils.url = "github:numtide/flake-utils";

  outputs = { self, nixpkgs, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = nixpkgs.legacyPackages.${system};
        base = with pkgs; [ R ];
        production = base ++ (with pkgs.rPackages; [ box dplyr sf ]);
        test = production ++ (with pkgs.rPackages; [ tinytest ]);
        development = test ++ (with pkgs; [ nixfmt-classic ])
          ++ (with pkgs.rPackages; [ languageserver styler lintr ]);
      in {
        devShells = {
          default = pkgs.mkShell {
            nativeBuildInputs = [ pkgs.bashInteractive ];
            buildInputs = development;
            shellHook = ''
              export APP_ENV="development"
            '';
          };
          test = pkgs.mkShell {
            nativeBuildInputs = [ pkgs.bashInteractive ];
            buildInputs = test;
            shellHook = ''
              export APP_ENV="test"
            '';
          };
          production = pkgs.mkShell {
            nativeBuildInputs = [ pkgs.bashInteractive ];
            buildInputs = production;
            shellHook = ''
              export APP_ENV="production"
            '';
          };
        };
      });
}

rgoulter · April 29, 2024, 11:29am

I’m not too familiar with R code.

It kind of depends on what artifacts you have and how you want to run them.

The key thing specific to R and Nix is using Nix to provide the R packages, rather than running install.packages(...). The Nixpkgs manual has a brief section on R, as does the official NixOS Wiki (wiki.nixos.org), although both are focused on dev environments.
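To illustrate the idea of letting Nix provide the R packages, here is a minimal shell.nix sketch. It assumes the package set from the original post (box, dplyr, sf); the name rEnv is just a local variable chosen for illustration.

```nix
{ pkgs ? import <nixpkgs> {} }:

let
  # R together with the listed packages, so that library(...) / box::use(...)
  # can find them without calling install.packages.
  rEnv = pkgs.rWrapper.override {
    packages = with pkgs.rPackages; [ box dplyr sf ];
  };
in
pkgs.mkShell {
  # rEnv provides the R and Rscript executables.
  packages = [ rEnv ];

  shellHook = ''
    export APP_ENV="production"
  '';
}
```

Running nix-shell in the same directory should then give an R that already sees those packages.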

As a reasonably simple example of writing a Nix derivation of a trivial R program which uses box:

e.g. mod/hello_world.R has:

#' @export
hello = function (name) {
    message('Hello, ', name, '!')
}

and main.R has:

box::use(mod/hello_world)

hello_world$hello('World')

with a script to run this run.sh:

#!/usr/bin/env bash

Rscript ./main.R

Then a Nix package for this hello world R program could be something like:

{ stdenv
, lib
, makeWrapper
, rWrapper
, rPackages
, coreutils
, which
}:

let
  rEnv = rWrapper.override { packages = [ rPackages.box ]; };
in
stdenv.mkDerivation {
  pname = "hello-world-r";
  version = "1.0.0";
  src = ./.;
  # makeWrapper is a build-time tool, so it belongs in nativeBuildInputs.
  nativeBuildInputs = [ makeWrapper ];

  doCheck = true;

  buildPhase = ''
    echo "compilation stuff"
  '';

  checkPhase = ''
    echo "run tests"
  '';

  installPhase = ''
    mkdir $out
    cp -r . $out/
    wrapProgram $out/run.sh --prefix PATH : ${lib.makeBinPath [ rEnv coreutils which ]}
  '';
}

This can be used with an expression such as pkgs.callPackage ./hello.nix {}, e.g. in a default.nix such as:

{ pkgs ? import <nixpkgs> {} }:

pkgs.callPackage ./hello.nix {}

then running nix-build, and running ./result/run.sh outputs “Hello, World!”.
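For the flake in the original post, one way to get from dev shells to entry points might be to expose the same package under the packages and apps outputs. This is a minimal sketch, assuming hello.nix from above sits next to flake.nix; the local name hello is only illustrative, and the dev shells from the original flake are left out for brevity.

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
  inputs.flake-utils.url = "github:numtide/flake-utils";

  outputs = { self, nixpkgs, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = nixpkgs.legacyPackages.${system};
        # The derivation described above, built via callPackage.
        hello = pkgs.callPackage ./hello.nix { };
      in {
        # `nix build` produces ./result containing the wrapped run.sh.
        packages.default = hello;

        # `nix run` executes the wrapped script directly.
        apps.default = {
          type = "app";
          program = "${hello}/run.sh";
        };
      });
}
```

Since the R environment is part of the package's closure, it is realised when the package is built, so starting the script afterwards should not trigger any build work. One caveat: run.sh as written calls Rscript ./main.R, which is resolved relative to the current directory.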

If all you want to do is run one script, then the hello.nix could be much simpler. One maybe not-obvious detail is that wrapProgram puts the R with the installed packages (rEnv = rWrapper.override { ... }) on the PATH that the script run.sh uses.

(coreutils is needed for uname, and the which program is also needed by R.)
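As a sketch of what "much simpler" could look like, writeShellApplication from Nixpkgs can generate the launcher script and put rEnv on its PATH in one step. This assumes the same main.R and mod/ layout as above; the cd into the copied source directory is my assumption, made so that box resolves mod/hello_world the same way as when run.sh is started from the project directory.

```nix
{ writeShellApplication
, rWrapper
, rPackages
, coreutils
, which
}:

let
  # R plus the packages the script needs.
  rEnv = rWrapper.override { packages = [ rPackages.box ]; };
in
writeShellApplication {
  name = "hello-world-r";

  # These end up on PATH inside the generated script.
  runtimeInputs = [ rEnv coreutils which ];

  # ${./.} copies the project directory into the Nix store.
  text = ''
    cd ${./.} || exit 1
    exec Rscript ./main.R "$@"
  '';
}
```

Built with pkgs.callPackage, this should yield ./result/bin/hello-world-r instead of ./result/run.sh.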

Nix can be used to build Docker images. (See e.g. the dockerTools section in the Nixpkgs manual.)

I was able to build a Docker image with e.g. docker.nix:

{ pkgs ? import <nixpkgs> {} }:

let
  helloWorldR = pkgs.callPackage ./hello.nix {};
in
pkgs.dockerTools.buildImage {
  name = "hello-world-r";
  tag = "latest";
  created = "now";
  copyToRoot = pkgs.buildEnv {
    name = "image-root";
    paths = [
      helloWorldR
      pkgs.bash
      pkgs.coreutils
      pkgs.which
    ];
  };

  extraCommands = "mkdir -m 0777 tmp";

  config.Cmd = [ "/run.sh" ];
}

that could be built with nix-build docker.nix and then loaded into Docker with docker load < ./result.

This is a start … but I noticed that the resulting image size is quite large, because the R in Nixpkgs is so large. (About 2GiB, vs the 500MiB or so of the r-base docker image).
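On the earlier point about reusing an unchanged base image in CI: dockerTools also has buildLayeredImage, which spreads store paths across separate layers, so the layers holding R and its packages can stay identical when only the project code changes. This is a sketch under the same assumptions as docker.nix above; it does not address the image size itself.

```nix
{ pkgs ? import <nixpkgs> {} }:

let
  helloWorldR = pkgs.callPackage ./hello.nix {};
in
pkgs.dockerTools.buildLayeredImage {
  name = "hello-world-r";
  tag = "latest";

  # Store paths are distributed over layers; these entries are also
  # linked into the image root, so /run.sh and /main.R exist.
  contents = [
    helloWorldR
    pkgs.bash
    pkgs.coreutils
    pkgs.which
  ];

  # R wants a writable temporary directory.
  extraCommands = "mkdir -m 0777 tmp";

  config.Cmd = [ "/run.sh" ];
}
```

It can be built and loaded the same way as the buildImage variant.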

telegott · April 30, 2024, 1:18pm

@rgoulter thank you very much for this engaged answer, very helpful! I’ll slowly make my way forward and might come back with a question, but this is a very good starting point. Just to check my understanding: using rWrapper automatically includes R itself, and the R packages and the R installation would not find each other if I installed them without rWrapper?

Are there any reasons for the nix-R being so large? Are these fundamental to nix or could that be improved?
Unfortunately this would mean I could not really use it in production in cases where spinning up a container is time-critical.

rgoulter · April 30, 2024, 1:50pm

As for whether e.g. a shell.nix which uses an overridden rWrapper would find packages installed from the running R with install.packages: I don’t know, sorry.

I noticed that I could use a shell.nix with R and rPackages separately to run hello.R, but the hello.nix package needed the use of an overridden rWrapper. Not sure why.

I used nix-tree to glance at the derivation’s dependencies. (There are other tools).

It seems that the biggest culprits are JDK, gfortran, and gcc. If for no other reason, I guess these are kept around because these tools are mentioned in R’s Makeconf? Nix automatically scans for runtime dependencies (as is discussed in one of the Nix Pills). But TBH I’m not sure if that’s what’s happening here.

I’d think that it’d be possible to get the R package down to the same size as the base R docker image from rocker (or whatever)… but I’m not sure how much work that would be.

telegott · May 7, 2024, 6:27pm

@rgoulter thanks for the explanation! Well, for now I can at least use nix for local projects; unfortunately, using it in deployed projects will have to wait.