I’m creating a flake template that I want to use widely in various projects. However, I don’t trust myself, and before I use this or any flake (template) widely, let alone put it in front of clients, I want to test that the flake does what I think it does.

For example, my template flake uses the wonderful treefmt-nix to ensure (in this OCD/toy example) that any `*.md` files are correctly formatted:
```nix
# flake.nix
{
  description = "The template flake";

  # nixpkgs and systems were previously implicit (resolved via the flake
  # registry); declaring them explicitly avoids surprises.
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.11";
    systems.url = "github:nix-systems/default";
    treefmt-nix.url = "github:numtide/treefmt-nix";
  };

  outputs = { self, nixpkgs, systems, treefmt-nix }:
    let
      eachSystem = f: nixpkgs.lib.genAttrs (import systems) (system: f nixpkgs.legacyPackages.${system});
      treefmtEval = eachSystem (pkgs: treefmt-nix.lib.evalModule pkgs { programs.mdformat.enable = true; });
    in
    {
      formatter = eachSystem (pkgs: treefmtEval.${pkgs.system}.config.build.wrapper);
      checks = eachSystem (pkgs: {
        formatting = treefmtEval.${pkgs.system}.config.build.check self;
      });
    };
}
```
So I would expect this flake to fail `nix flake check` if there is a poorly formatted `README.md`, and to succeed if there is a properly formatted one.

At first I thought I should probably use some external testing library to deal with the test fixtures (a fresh temp directory, `nix flake init`, cleanup, etc.). But then it occurred to me that nix already has such an isolated, reproducible environment – the derivation.
So I hacked together this “host” flake (which includes the above template):
```nix
# flake.nix (host)
{
  description = "The host flake";

  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.11";
    flake-checker.url = "github:DeterminateSystems/flake-checker";
    flake-iter.url = "github:DeterminateSystems/flake-iter";
  };

  outputs = { self, nixpkgs, flake-iter, flake-checker }:
    let
      supportedSystems = [ "x86_64-linux" "aarch64-darwin" "x86_64-darwin" ];
      forEachSupportedSystem = f: nixpkgs.lib.genAttrs supportedSystems (system: f {
        pkgs = import nixpkgs { inherit system; };
      });
    in
    {
      checks = forEachSupportedSystem ({ pkgs }: {
        test-template-bad-readme =
          pkgs.runCommand "test-bad-readme"
            {
              buildInputs = [ pkgs.nix ];
              __noChroot = true;
              NIX_CONFIG = "experimental-features = nix-command flakes";
            }
            ''
              mkdir -p $TMPDIR/home
              export HOME=$TMPDIR/home
              nix flake init --template ${self}#default
              echo "bad _markdown*" > README.md
              if ! nix flake check; then
                touch $out
              else
                echo "Test failed: nix flake check succeeded unexpectedly"
                exit 1
              fi
            '';

        test-template-good-readme =
          pkgs.runCommand "test-good-readme"
            {
              buildInputs = [ pkgs.nix ];
              __noChroot = true;
              NIX_CONFIG = "experimental-features = nix-command flakes";
            }
            ''
              mkdir -p $TMPDIR/home
              export HOME=$TMPDIR/home
              nix flake init --template ${self}#default
              echo "good *markdown*" > README.md
              if nix flake check; then
                touch $out
              else
                echo "Test failed: nix flake check failed unexpectedly"
                exit 1
              fi
            '';
      });

      devShells = forEachSupportedSystem ({ pkgs }: {
        default = pkgs.mkShell {
          packages = with pkgs; [
            nixpkgs-fmt
            # `system` is not in scope in this lambda; derive it from pkgs
            flake-iter.packages.${pkgs.system}.default
            flake-checker.packages.${pkgs.system}.default
          ];
        };
      });

      templates = {
        default = {
          description = "a template";
          path = ./templates/default;
          welcomeText = ''
            Welcome to the default template
          '';
        };
      };
    };
}
```
This kinda works, but before expanding on it I want to get some feedback on

- whether this pattern makes sense,
- whether it is idiomatic,
- how other people are testing their flakes,
- … what I’m surely missing.

Strong opinions welcome!
Full toy repo here.
Some more details, background and problems of the above:
Existing Tools
I did look at existing nix testing libraries, but
- nix-unit, nixt and namaka all look great for adding better ergonomics around this – but they don’t seem to provide a framework or helpers for dealing with the test fixtures (working dir, etc.)
- NixOS Testing seems amazing, but I think it’s more than I need. I don’t need VMs; I don’t consider cross-platform flake compatibility to be my concern.
Tools in 1) seem insufficient because they’re about unit testing, and 2) seems overkill because it’s about integration testing.
I think what I mean by testing a flake is an end-to-end test, so right in the middle.
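To illustrate the gap: the tools in 1) cover pure expression-level tests, roughly in the style below (a sketch based on nix-unit’s `expr`/`expected` convention – treat the exact attribute names as my assumption):

```nix
# tests.nix – pure expression-level unit tests: no working directory,
# no `nix flake init`, no filesystem fixtures at all.
{
  testMathWorks = {
    expr = 1 + 1;
    expected = 2;
  };
}
```

What I want sits a level above this – exercising the template as a user would – but well below spinning up a VM.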
It’s also hard to search for “testing flakes”, because most of the hits are about running tests inside flakes, so it’s very possible I missed something.
Separation of (testing) concerns
I’m also aware that I can’t go out and just test every behavior (say, whether treefmt-nix really works on Ubuntu 22.02 + nix 18.02, or whether its markdown spec is correct, or whatever). But it does seem that, because my flake added a check for proper formatting, it is properly my responsibility to test that it does that in principle (i.e. using a toy example).
Why test?
I find few things more frustrating and wasteful than bad DevOps, build tools and CI/CD, because it literally multiplies the debugging complexity: is this bug a bug in my code, or in the flake?
So my experience has been that investing heavily in testing is worth it.
Problems with the above example
- This requires running nix-inside-nix, which seems quite complicated, and because I’m not testing nix, but my flake, I don’t actually find it necessary. In docker, there’s a way to avoid docker-in-docker by exposing the docker socket of the host of the parent container to the child container, thereby effectively creating “sibling” containers. Is such a thing possible in nix flake checks?
- Because the flake template in the host’s flake needs to fetch all its dependencies, this only works when breaking the sandbox (`sandbox = false` / `__noChroot = true`). Any way to avoid this? If I ship the template with a `flake.lock`, is there a way I can let the host flake calculate all these derivations, and “pass” them (their store paths?) to the derivation process of the template? This way, the (nested) `nix flake check` could work inside the sandbox, no?
- `touch $out` to mark success seems … hacky. Is there a more idiomatic way to express conditions?
- Obviously, the above example is verbose. Are there any existing libraries to help factor this stuff out?