Smoothing the flakes learning curve

jacg · February 24, 2021, 2:45pm

TLDR

The Why

I want to provide a thoroughly documented trail up the Nix flakes learning curve.

One of the big problems I have with learning Nix in general is that, given its huge complexity, typos, outdated information, and assumptions made or shortcuts taken by the author often waste vast amounts of the reader’s precious time.

I want to start with just about the most trivial flake that does something, and take small steps which introduce some aspect or feature, and build up to something of arbitrary complexity.

Crucially, each of these steps should consist of self-contained, complete, working code, documented with examples of usage which are automatically tested for correctness.

The How

This presents me with some difficulties:

IIANM, a flake is a version-controlled filesystem-tree with a flake.nix at its root. Two different steps in my progression of examples must therefore be distinct version-controlled filesystem-trees with a flake.nix at their roots. The obvious way of storing these seems to be as different commits in a Git repo. Thus my final product will be the whole history (cleaned up, with one didactic step per commit) of a Git repo. But how on earth do you version-control this product? IOW, how do you version-control the evolution of the entire history stored within a Git repo (that is, a sequence (nay, graph) of trees), as opposed to version-controlling the evolution of a single filesystem tree?

Furthermore, introducing changes to this is a major PITA, as any change in one of the earlier steps tends to require a lot of rebasing to clean up the whole set. I can imagine that this might become intractable as the number of steps grows.
I’ve tried to use shellspec to write the tests. I’m not at all familiar with it, so maybe I simply haven’t spotted some obvious feature that it provides for dealing with this: Consider something like nix develop; I would like to write some tests which verify features of the environment inside the shell created by nix develop, but shellspec just hangs there until the shell exits.
The source of the BDD-style shellspec tests is fairly legible, but it would be better if these could be gathered together in a single, pretty, mdbook book (or similar). But given that the separate steps (and their corresponding documenting tests) are scattered across different commits in the repo, this is going to be somewhat fiddly.
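On the nix develop hang described above, one possible workaround is to have each test run a single command inside the dev shell and return, rather than start an interactive shell. The sketch below assumes that the Nix version in use supports nix develop --command. The helper name in_dev_shell and the NIX_BIN override are hypothetical, introduced only for illustration.

```shell
# Sketch: run one command inside the flake's dev shell, then return,
# so a test runner is not left waiting on an interactive shell.
# NIX_BIN is a hypothetical override (not part of Nix), e.g. for dry runs.
in_dev_shell() {
  "${NIX_BIN:-nix}" develop --command "$@"
}
```

A test could then call something like in_dev_shell sh -c 'command -v hello' and assert on the output. Whether this plays well with shellspec has not been verified here.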

The What

My attempt at getting (the first two aspects of) this going can be found here. So far it only contains the outline of three steps:

A single-system, single-package flake.
Generalize it to multiple systems.
Introduce a defaultPackage.

To run the tests of any single step manually:

nix run github:jacg/flakes-learning-curve/shellspec

(shellspec is not available in nixpkgs, so I’ve put it in a flake in an orphan branch of the same repo. I’ll try to submit shellspec to nixpkgs Any Moment Now.)

There is a prototype GitHub action which runs the whole lot. Very pedestrian at the moment: it manually checks out each of the branches corresponding to the steps in the sequence, in turn, and re-runs the tests.
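The check-out-and-retest loop described above might look roughly like the following. This is a minimal sketch, not the actual GitHub action: it assumes the step branches are named step-1, step-2, and so on, and run_steps is a hypothetical helper that takes the test command as its remaining arguments.

```shell
# Sketch: run one test command on every step branch of a repo.
# Assumes branches named step-1, step-2, ...; run_steps is a hypothetical helper.
run_steps() {
  local repo=$1; shift              # remaining arguments: the test command
  local branch failed=0 branches
  # Collect matching branches first, in version order (step-2 before step-10).
  mapfile -t branches < <(git -C "$repo" for-each-ref \
    --sort=version:refname --format='%(refname:short)' 'refs/heads/step-*')
  for branch in "${branches[@]}"; do
    git -C "$repo" checkout --quiet "$branch" || return 1
    # Run the tests from the repo root, in a subshell so the caller's cwd is kept.
    if (cd "$repo" && "$@"); then
      echo "PASS $branch"
    else
      echo "FAIL $branch"
      failed=1
    fi
  done
  return "$failed"
}
```

For example, run_steps . nix run github:jacg/flakes-learning-curve/shellspec would run the shellspec flake once per step. Note that the repo is left on the last branch that was checked out.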

Ideas?

Can you suggest a better way of organizing and presenting these steps, while maintaining the requirement of automatically verifying the correctness of the samples?

This structure seems very difficult to work with, so I don’t want to invest too much time and effort into it, if a more manageable approach can be found.

jorsn · February 24, 2021, 4:05pm

You are mistaken about “at its root”, and a flake is not necessarily version-controlled (see the tarball and path fetchers, and “nix flakes: flakes in subdirectories can access files in parent directories (breaking the implied seal of flake.nix)”, Issue #4414 in NixOS/nix on GitHub).

Maybe version control a script generating the repo?

Use pijul or darcs, stop rebasing and write a flake fetcher for those VCSs (examples are here).

jacg · February 24, 2021, 5:00pm

Yes, thought about that. Not sure how that would fit in with allowing me to fiddle around and run tests in any given step.

Te-he … darcs was the first D(istributed)VCS I ever used, and I’m keeping half an eye on pijul, hoping that it will take over the world before I kick the bucket. However, even though I believe that Git’s victory in the VCS popularity contest was a Bad Thing for the industry, Git has Magit, and life is too short to use a VCS that doesn’t have Magit.

roni · February 25, 2021, 8:36pm

This is a neat idea, but seems really unwieldy in the end.

I am pretty old school with this sort of thing: I’d suggest that you simply have a bunch of “versioned” files, each showing the next step in the process. Then, you version control the set in the usual way (by using Git for its stated purpose). This approach is unsexy and manually somewhat tedious, but I think it would be easier to manage and easier to consume than storing the tutorial evolution in Git rather than explicitly as parallel files.

BTW, I think this effort is an awesome idea and I’m excited to see how it develops. I want to learn about flakes and other best practices around Nix/NixOS but I find I have neither the time nor the mental energy to do it. The next best thing is to equip myself with the tutorials coming from others’ hard work and use those like Ripley’s powered exoskeleton to help me learn things in a more supercharged kind of way.

roni

jacg · February 25, 2021, 9:36pm

Very unwieldy indeed.

Not entirely sure what you mean. Do you mean that each Git commit should contain multiple steps in the process? If so, my problem is that this would require me to have multiple files with the exact name flake.nix at the root of the repo! (One corresponding to each step.)

I am now aware that flake.nix is allowed to exist elsewhere too, though I haven’t had the time to look into exactly what this means. But in roughly 99% of typical cases flake.nix will be in the root of the repo, and that’s certainly where it should be when explaining things to beginners: doing anything else would be confusing.

raboof · February 26, 2021, 5:52am

Another option might be to have each step in its own branch. That way, if you want to do maintenance, you just make the change in the first step where it applies, and merge that into all subsequent steps. Still some tedious work, but I’m not sure that can be prevented entirely. The downside is of course that you can not represent the maintenance as a single PR, so this is not great if you expect to collaborate on this much.
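The merge-forward maintenance described above could be scripted along these lines. This is only a sketch under assumptions: merge_forward is a hypothetical helper, the branch names are passed in explicitly in step order, and conflicts are left for manual resolution.

```shell
# Sketch: propagate a fix made on an early step into every later step by merging.
# Branch names are passed explicitly; merge_forward is a hypothetical helper.
merge_forward() {
  local repo=$1 prev=$2; shift 2    # remaining arguments: later branches, in order
  local branch
  for branch in "$@"; do
    git -C "$repo" checkout --quiet "$branch" || return 1
    # Stop at the first conflict so it can be resolved by hand.
    git -C "$repo" merge --quiet --no-edit "$prev" || return 1
    prev=$branch
  done
}
```

After committing a fix on step-1, merge_forward . step-1 step-2 step-3 would merge it into step-2, and then step-2 into step-3.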

Thus my final product will be the whole history

How will visitors consume this?

Another option might be to have one ‘master’ repository with a flake.nix for each step (so either renamed or not at the root), and a script to convert those into separate branches/commits on a repo for consumption by your visitors.

jacg · February 26, 2021, 7:58am

Here we have to address a linguistic ambiguity: In Git, a branch is just a label which is typically (but not necessarily) used to track the growth of a topological branch of the DAG of commits that Git is tracking.

My example repo already has each step on its own Git branch, but those Git branches are on the same topological branch. To wit:

step-1 <- step-2 <- step-3
shellspec

where the step-Ns are all Git branches labelling commits on the same topological branch, and shellspec is a Git branch labelling an orphan commit which (for the time being) contains the shellspec flake used in the tests in the steps.

I interpret your suggestion to be that each step should evolve on separate topological branches:

c88412b <- 11ef8c3 <- 4533c95 <- step-1
9d27c50 <- 2612b61 <- 81b4bc9 <- 50b3664 <- step-2
4547fb5 <- 013ddc8 <- step-3
shellspec

with one disconnected sub-DAG for each step’s history.

My original has multiple Git branches on a single, connected DAG
Your proposal has multiple disconnected (sub-)DAGs with a Git branch at the HEAD of each.

Is that what you meant? If so, then yes, I have been thinking about this, and it offers some advantages and some disadvantages.

Exactly! This is true in both cases. And collaboration would be most welcome.

An mdbook which discusses each step, referencing the branch which contains its implementation.
git log --decorate --graph --all --oneline … preferably via some more interactive and pretty interface that they happen to have available.

This will work nicely in my original idea of multiple Git-branches on a single connected DAG, with no historical commits in between; but not so well in the second model with disconnected DAGs and historical commits.

You mean one repo per step, rather than one commit-in-the-same-repo per step?

If so, then I don’t see what advantages it offers over the disjoint-sub-DAGs-in-one-repo model. Plenty of disadvantages though. So I guess I have misunderstood you.

I suspect that in the end the presentation repo’s history will have to be generated by some script, from the contents of a (probably) separate repo in which the whole thing is developed. What worries me here is how to test stuff quickly and interactively in the development repo: the flakes in the development repo should work identically (and have the same external interface including stuff like nix run github:user/repo/branch!) to the ‘official’ ones.

raboof · February 26, 2021, 8:25am

jacg:

I interpret your suggestion to be that each step should evolve on separate topological branches:
c88412b <- 11ef8c3 <- 4533c95 <- step-1
9d27c50 <- 2612b61 <- 81b4bc9 <- 50b3664 <- step-2
4547fb5 <- 013ddc8 <- step-3
shellspec
with one disconnected sub-DAG for each step’s history.

My original has multiple Git branches on a single, connected DAG

Your proposal has multiple disconnected (sub-)DAGs with a Git branch at the HEAD of each.

I was thinking of having merge commits. This gets hard to represent in ASCII art though. Anyway, let’s not go into this further since, if I understand correctly, you want to be able to do updates touching multiple steps in a single MR.

That was exactly my thinking. So the ‘master repo’ might have all steps side-by-side on the same commit (for example flake-step1.nix, flake-step2.nix, etc. or step1/flake.nix, step2/flake.nix, etc.). There you can collaborate and update all steps in 1 MR.

That won’t work, but perhaps that is not such a big problem: if generating the ‘visitor-facing’ representation of the flakes repo is fast then you can just run that script each time you want to test - perhaps even automatically on file changes.
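A generator of this kind could be quite small. The sketch below assumes the step1/flake.nix, step2/flake.nix layout and produces a fresh repo with one commit and one branch per step. The function name generate_steps and the step-N branch names are assumptions made for illustration.

```shell
# Sketch: turn side-by-side step directories (step1/, step2/, ...) into a
# fresh repo with one commit and one branch (step-1, step-2, ...) per step.
# generate_steps and the naming scheme are hypothetical.
generate_steps() {
  local src=$1 out=$2 n=1
  git init --quiet "$out" || return 1
  while [ -d "$src/step$n" ]; do
    # Replace the working tree with this step's snapshot, keeping .git.
    find "$out" -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} +
    cp -R "$src/step$n/." "$out/"
    git -C "$out" add --all
    git -C "$out" commit --quiet --allow-empty -m "Step $n" || return 1
    # Label this commit so that each step has its flake.nix at the repo root.
    git -C "$out" branch --force "step-$n"
    n=$((n + 1))
  done
}
```

Running generate_steps . ../presentation from the master repo would then give a repo whose step-N branches each have their own flake.nix at the root. The output directory is expected to be new or empty.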

jacg · February 26, 2021, 8:26am

Exactly.

What’s keeping me going is the conviction that, in the long run I will need far less time and energy to produce results in Real Work, with Nix than without it. Especially if I manage to get my colleagues to embrace it. I am sure that the whole team would be so much more productive if we had decent Nix infrastructure and processes in place. But for this to happen I need to:

Get myself through the phase where getting anything done in Nix is a huge time and energy sink. (The good news is that once I have figured out how to do something, the benefits are reusable and hugely reliable. The bad news is that figuring out each new thing is veeeeeery expensive, and there are so many ‘things’ to learn about.)
Get the team to even begin to understand the point of Nix, and get them to appreciate that it can be hugely relevant to their comfort and productivity. (I think that flakes can be very important here!)
Find a way of quickly and cheaply educating
- everyone on the team to be competent Nix consumers
- a few key members to be Nix-providers

There are Real Work milestones looming on the horizon and I have been overspending on my Nix time budget recently: my time and energy available for Nix will soon drop to zero for a while.

jacg · February 26, 2021, 8:33am

This is crucial if there’s to be any collaboration on this.

Yup, this might be the key observation.

roni · February 27, 2021, 7:04pm

Sorry, I was very unclear in my previous response.

What I was suggesting was to simply maintain several directories side by side, each showing one step of the evolution. So you’d have a step1 directory, with a flake.nix file in it showing your initial bootstrap. Then a step2 with the same flake.nix plus whatever modifications you want to show in that step. Then on and on until you have stepN containing your final, working flake example.

Now you have your evolution encoded as a sequence of full directory snapshots in a single Git commit, and Git does its ordinary job of versioning changes to your whole tutorial. With this setup you could even generate diffs between the steps if you wanted, which you could use in a tutorial similarly to how the Webpack docs do it (Getting Started | webpack).
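Generating the diffs between steps needs nothing beyond the standard diff tool. A minimal sketch, assuming directories named step1, step2, and so on (step_diff is a hypothetical helper):

```shell
# Sketch: print a unified diff between two adjacent step directories.
# step_diff is a hypothetical helper; directory names are assumptions.
step_diff() {
  local from=$1 to=$2
  # diff exits with status 1 when the inputs differ; treat that as success.
  diff -ru "$from" "$to" || [ $? -eq 1 ]
}
```

For example, step_diff step1 step2 prints the changes introduced by the second step, which could be pasted into the tutorial text.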

So you see how this is an unsexy, keep-it-simple sort of approach. But in the end perhaps that simplicity would be useful.

roni

jacg · February 27, 2021, 11:50pm

This is how I would have organized it in the first place, were it not for the fact that the flake.nix at the root of the repo has special significance which is key to a major attraction of flakes, namely that it determines the behaviour of things like

nix build github:user/repo[/branch]
nix shell github:user/repo[/branch [--command command]]
nix run github:user/repo[/branch [-- args]]

If you look at the examples of usage, many of them (for instance this one) show off these kinds of uses, and these require each example’s flake.nix to be at the root of the repo.

Do you now see why I need flake.nix to be at the root, and not in some subdirectory?

I still haven’t had the time to look into the exact meaning of flake.nix in subdirectories, but whatever it is, even if it allows the use of commands similar to those shown above, it will (at the very least) introduce some complication of the syntax, and that would be very unhelpful when trying to show off the beauty, simplicity and power of the interface, when trying to sell the whole idea to outsiders, and when trying to present the simplest and least confusing way of writing simple flakes.

flake.nix at the root of the repo is the 99% case, and an introduction that aims to be clear and simple shouldn’t contain examples whose structure is extremely unusual.

roni · February 28, 2021, 2:29am

Ahh yes, makes sense now. Thanks for bearing with me on this.

How about using separate repos in a new org that you create for this purpose? Then you could maybe do coordinated updates by making another repo with submodules pointing to the “real” repos and doing things there. I dunno, that doesn’t seem ideal either.

More generally, it’s too bad you can’t also specify a full path within a flake repository for the flake commands to use. It prevents the use of a monorepo sort of structure (along the lines I was suggesting originally). But I don’t know much about flakes at all, so perhaps there’s good reason for these constraints.

roni

jorsn · March 1, 2021, 12:15am

What do you mean? Typing nix shell /path/to/flake#package?
Or referring to a subdirectory of a flake in the flake URL, like this:

{
  inputs.subflake.url = "github:jorsn/testflake?dir=subflake";

  outputs = { self, subflake }: {

    packages.x86_64-linux = {
      inherit (subflake.inputs.nixpkgs.legacyPackages.x86_64-linux) hello;
    };
    defaultPackage.x86_64-linux = self.packages.x86_64-linux.hello;

  };
}

Note that the subdirectory-flakes are not isolated from each other:

jacg · March 1, 2021, 8:28am

You can, but … as Stephen Hawking famously noted, each equation in his (popular science) book would halve the readership.

So it is here: the intended readership are Nix non-experts and each complication will halve the readership. Thus I don’t care that it’s possible to place flake.nix in subdirectories; it deviates from the norm and presents the reader with a more complicated structure than is strictly necessary to get going with flakes, thereby halving the readership and drastically reducing the usefulness of the whole endeavour.

The reader needs to see a single flake.nix at the root of the repo. (At least in the first 25 steps, or so.)