Templating for scripts and configuration files

tcurdt · December 6, 2023, 9:37pm

Let’s say I am building a package that needs some configuration files or shell scripts that should be expanded based on some inputs.

Is there some kind of templating language? Or what’s the usual way to do this in nix?

Think of something along the lines of (just an example):

myscript = nixpkgs.writeScriptBin "myscript" ''
  #!/usr/bin/env bash
  {% for item in items %}
  echo ${FOO_{{item}}}
  {% endfor %}
'';

jtojnar · December 6, 2023, 10:07pm

Nix strings can contain interpolated expressions. Since Nix is a functional language, you will need to use builtins.map instead of loops. Or more conveniently, lib.concatMapStringsSep from Nixpkgs:

pkgs.writeScriptBin "myscript" ''
  #!/usr/bin/env bash
  ${lib.concatMapStringsSep "\n" (item:
    "echo \${FOO_${item}}"
  ) items}
''
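
For reference, here is a minimal sketch of what that interpolation expands to. It assumes `<nixpkgs>` is on your NIX_PATH and uses a made-up `items` list; it only evaluates a string, so nothing gets built:

```nix
# Evaluate with: nix-instantiate --eval expand.nix   (the file name is just an example)
let
  lib = (import <nixpkgs> { }).lib;
  items = [ "socks" "towel" ]; # hypothetical input
in
''
  #!/usr/bin/env bash
  ${lib.concatMapStringsSep "\n" (item: "echo \${FOO_${item}}") items}
''
# The resulting string is:
#   #!/usr/bin/env bash
#   echo ${FOO_socks}
#   echo ${FOO_towel}
```

The `\${` inside the double-quoted string is what keeps the shell's `${...}` literal, while the plain `${item}` is filled in by Nix.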

tcurdt · December 6, 2023, 10:23pm

OK - so it’s common not to use some form of template language but go for straight nix.

I can see how this works for the small things - but doesn’t that get very messy if it gets more complicated?

Either way - thanks for the answer!

jtojnar · December 7, 2023, 7:09am

It can get messy. But if you need loops, a template language would probably only reduce the messiness by a constant factor so I do not find it particularly compelling.

You can also break the large generated code into smaller, reasonably named Nix variables/functions that are then composed together, similar to how you would break complex templates into fragments/included subtemplates.
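
A rough sketch of that fragment approach, assuming `<nixpkgs>` is available. The names `echoVar`, `exportVar`, `exports` and `echos` and the `items` list are invented for illustration:

```nix
{ pkgs ? import <nixpkgs> { } }:
let
  lib = pkgs.lib;

  # Hypothetical inputs, only for illustration.
  items = [ "socks" "towel" ];

  # One small function per fragment, much like a sub-template.
  exportVar = item: "export FOO_${item}=${lib.escapeShellArg item}";
  echoVar = item: "echo \"\${FOO_${item}}\"";

  # Fragments are plain strings that get composed by name.
  exports = lib.concatMapStringsSep "\n" exportVar items;
  echos = lib.concatMapStringsSep "\n" echoVar items;
in
# writeShellScriptBin adds the shebang itself.
pkgs.writeShellScriptBin "myscript" ''
  ${exports}
  ${echos}
''
```

The final string then reads almost like a table of contents, and each fragment can be changed on its own.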

If you are not dealing with loops but just a few variables, it might indeed be cleaner to use the substitute* functions to replace placeholders in an external file. That is a sort of templating language in Nixpkgs.
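
A small sketch of the placeholder approach, using the `substitute` shell function that stdenv provides inside `runCommand`. The template is created with `pkgs.writeText` only to keep the example self-contained (in practice it would be a file next to your Nix code), and the `greeting` variable is made up:

```nix
{ pkgs ? import <nixpkgs> { } }:
let
  # Stand-in for an external template file; @greeting@ is the placeholder.
  template = pkgs.writeText "greet.sh.in" ''
    #!/bin/sh
    echo "@greeting@, world"
  '';
in
pkgs.runCommand "greet.sh" { greeting = "Hello"; } ''
  # substitute comes from stdenv's setup script, which runCommand sources.
  # --subst-var replaces @greeting@ with the value of $greeting.
  substitute ${template} "$out" --subst-var greeting
  chmod +x "$out"
''
```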

Or if you really prefer, you can use any templating language you want. For example j2cli:

pkgs.runCommand
  "myscript"
  {
    passAsFile = [
      "paramsJson"
    ];
    paramsJson = builtins.toJSON {
      items = [
        "socks"
        "towel"
      ];
    };
  }
  ''
    ${pkgs.j2cli}/bin/j2 -f json ${./my-script.j2} "$paramsJsonPath" > "$out"
  ''

Just be aware of issues with Import from Derivation when you are using a derivation builder to build a Nix expression.

tcurdt · December 7, 2023, 12:01pm

Uh, now I finally see the problem with “IFD”.

The output basically depends on the input to exist first. Urgh.
But isn’t this just coming from the fact that the store location is not really fixed?

Reminds me a little of the situation of location independent code.
If it’s just about paths - why isn’t it solved the same way?
Going for relative instead of absolute.

jtojnar · December 7, 2023, 9:41pm

Not really. The path will be computed as part of the evaluation just fine.

The issue is that Nix separates evaluation and build phases, and the former is sequential. Normally, when Nix evaluates an import call, it would load the contents of the imported path and then evaluate those. But if the imported path is of an unrealized derivation (the path does not exist in the store), Nix will need to realize (e.g. build) it before it can continue evaluation. This is a huge bottleneck – Nix would want to perform realizations of the outer derivations in parallel but it cannot because it will not know about them until the evaluation and instantiation finishes.

I recommend reading https://nixos.org/manual/nix/stable/language/import-from-derivation.html, it explains it pretty well IMO.

tcurdt · December 8, 2023, 11:48am

What I don’t understand yet:

The path within the store is fixed. The content is fixed (based on the given inputs).
Why would that need to be realized first?

At least to me that’s not clear from the docs.

I can see how that’s needed when the store modification happens outside of nix. (from the docs echo -n hello > $out) but not when all happens within nix.

jtojnar · December 8, 2023, 6:13pm

The path is fixed based on the inputs but the content is not. Unless talking about fixed output derivations, the output is not guaranteed to be the same (though it is a good practice). But that is not the issue. The point is that until a derivation is realized, its output is not available. And when the output contains the Nix expression we are trying to import, we cannot proceed with evaluation until the derivation is realized.

pkgs.writeScriptBin, pkgs.runCommand and other trivial builders are just wrappers around stdenv.mkDerivation, which itself wraps around Nix’s builtins.derivation. If you want to get the output of the derivation they create, you will need to build or otherwise realize it. (Well, you could try to extract the text attribute and replicate the templating logic in pure Nix but then you would not need a derivation in the first place.)

tcurdt · December 8, 2023, 6:37pm

Well, in this very case the nix file does have the full content.

''
  #!/usr/bin/env bash
  ${lib.concatMapStringsSep "\n" (item:
    "echo \${FOO_${item}}"
  ) items}
''

And I don’t see (or understand) yet why this would need to be realized first just because it is supposed to be written to the store folder.

It is a function with one parameter items - all is known.

The fact that writeScriptBin makes it an IFD seems a little unfortunate in this case. I am just trying to understand the reason why this would have to be the case.

jtojnar · December 8, 2023, 7:12pm

Ah, sorry, I could have been clearer. The examples in this thread would not be IFD since they do not try to access the contents of the output during Nix evaluation.

The warning was meant to be about doing something like the following:

let
  myScript = pkgs.writeText "answer.nix" ''
    2 * 3 * 7
  '';
in
import myScript

tcurdt · December 8, 2023, 8:16pm

So the fact that writeScriptBin is wrapping mkDerivation is not creating a problem (IFD) in my case as it’s not being used as another input.

Is that the correct way of looking at it?

jtojnar · December 8, 2023, 8:58pm

Using it as an input to derivation would generally be fine as well. For example, Nix can evaluate the following expression just fine without having to build anything:

(let
  myFormula = pkgs.writeText "formula.txt" ''
    2 * 3 * 7
  '';
in
pkgs.runCommand
  "answer.txt"
  { 
    nativeBuildInputs = [
      pkgs.bc
    ];
  }
  ''
    cat "${myFormula}" | bc > "$out"
  '')

You only get IFD when you pass the output of a derivation to one of the functions listed on the Import From Derivation page of the Nix Reference Manual.

See also the diagrams on that page.

tcurdt · December 9, 2023, 1:52am

Interesting.

Of course the output will still be deterministic despite passing it through bc.
So it makes sense this works without an IFD.

But in theory reading the file back again should not change that fact either.
After all we know what result of the read would be without actually reading it.

But of course that is harder to reason about in the larger scheme of things.
Is this why functions like “readFile” are just declared as a limit of the evaluation?

What if bc wasn’t deterministic and instead would generate just some random output?

rhendric · December 9, 2023, 2:46am

Whether a build script is deterministic is one question; whether a Nix expression uses IFD is another. They aren’t related.

IFD is only involved if the value of a Nix expression depends on reading store contents. If a build script reads store contents, there’s no IFD.

readFile, for example, returns a string in the Nix expression language, and that string can’t be computed unless and until the file exists in the store. That means that Nix has to evaluate one derivation and realize it in the store before it can even evaluate the next one. That’s why readFile is on the short list of functions that cause IFD.

When a Nix expression references another derivation and coerces that derivation to a string, as in ''cat "${myFormula}" | bc > "$out"'', the value of myFormula that is interpolated in this string is just the path of the result of myFormula, not the contents of files created by its build script. The path can be computed without actually creating it in the store, and that’s what Nix does; it will evaluate both derivations before realizing either of them. So no IFD. It would still be no IFD if bc were not deterministic; that’s just not a related question.
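
To make the contrast concrete, here is a minimal sketch, assuming `<nixpkgs>` is available. The attribute names `path` and `contents` are invented for the example:

```nix
{ pkgs ? import <nixpkgs> { } }:
let
  myFormula = pkgs.writeText "formula.txt" ''
    2 * 3 * 7
  '';
in
{
  # No IFD: interpolation yields only the store path,
  # which Nix computes during evaluation without building anything.
  path = "${myFormula}";

  # IFD: the value is the file's contents, so myFormula has to be
  # realized in the store before evaluation can continue.
  contents = builtins.readFile "${myFormula}";
}
```

Because attribute sets are lazy, evaluating only `path` (for example with `nix-instantiate --eval -A path`) should not build anything, while asking for `contents` forces a build in the middle of evaluation, or an error if the `allow-import-from-derivation` setting is turned off.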

tcurdt · December 9, 2023, 3:22am

Whether a build script is deterministic is one question; whether a Nix expression uses IFD is another. They aren’t related.

Hm. Then I am still missing something.

readFile , for example, returns a string in the Nix expression language, and that string can’t be computed unless and until the file exists in the store

But the content of the file is known. Whether we write that file into the store or not.
And if the content of the file is known at evaluation time - why would a read of the file change that equation?

If nix is/was smart enough it could even avoid the read of the file as an optimization. Not that it would make much sense to write the content and read it again - but I guess we are discussing theoreticals here.

the value of myFormula that is interpolated in this string is just the path of the result of myFormula , not the contents of files created by its build script

This feels like the important part - but I feel I am still stewing on that.

It would still be no IFD if bc were not deterministic; that’s just not a related question.

So the paths don’t change with the content - but only with the input variables? That’s why?

rhendric · December 9, 2023, 3:27am

The content of the file isn’t known to Nix until the build script runs. This could be because the build script is non-deterministic, like date > $out. But much more commonly, it’s because the build script invokes a deterministic but external tool, like a compiler, to produce the output file.

You and I might consider the output of a compiler to be ‘known’ if the source files are known, but Nix doesn’t know what that output will be until it runs the compiler, which it does as part of realization.

Yes, exactly.

tcurdt · December 9, 2023, 4:04am

But cycling back to the beginning…

pkgs.writeScriptBin "myscript" ''
  #!/usr/bin/env bash
  ${lib.concatMapStringsSep "\n" (item:
    "echo \${FOO_${item}}"
  ) items}
''

Here nix should know the content of the file and the path.
There is only one input being “items” and no side effects.

I understand this not being the same situation when calling a compiler - but if nix was to do a readFile("myscript") I don’t see why it couldn’t optimize the file read away.

There is of course no real practical reason of doing so (that I can think of).
I am just testing out the corner cases to understand this.

Thanks for helping with my mind gymnastics here

rhendric · December 9, 2023, 4:17am

It basically is the same situation as calling a compiler, actually. writeScriptBin is one of the ‘trivial builders’, but they work just like any other derivation: they have a build script that Nix runs, and that script is responsible for producing outputs. If you drill deep enough into the definition of writeScriptBin, you eventually get to this line:

echo -n "$text" > "$target"

It’s just a line in a Bash script, producing an output file that Nix otherwise doesn’t know anything about. Nix treats build scripts the same way whether they’re invoking echo or gcc.

And notice too that this script is defined in the Nixpkgs repo (along with all the other trivial builders), not in Nix itself. That means that Nix would have to be very clever indeed to recognize a readFile pointing to a trivial builder and optimize the IFD away—it’d have to be coupled to the structure of Nixpkgs, or it’d have to statically analyze Bash scripts and find echos to paths it can recognize statically, both of which seem like fraught endeavors. In practice, if you have text you want to write to a file in one derivation and use at evaluation time in another, it’s far more straightforward to store that text in a variable and share the variable, without using readFile.
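
A sketch of that last suggestion, assuming `<nixpkgs>` is available. The names `formula`, `formulaFile`, `answer` and `summary` are made up for illustration:

```nix
{ pkgs ? import <nixpkgs> { } }:
let
  # The text lives in a Nix variable, so evaluation never has to read it back.
  formula = "2 * 3 * 7";

  # Written to the store for build-time consumers.
  formulaFile = pkgs.writeText "formula.txt" (formula + "\n");
in
{
  # Build-time consumer: only the store path of formulaFile is interpolated.
  answer = pkgs.runCommand "answer.txt" { nativeBuildInputs = [ pkgs.bc ]; } ''
    bc < ${formulaFile} > "$out"
  '';

  # Evaluation-time consumer: uses the variable directly, no readFile and no IFD.
  summary = "formula: ${formula}";
}
```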

tcurdt · December 9, 2023, 11:37am

Well, that argument feels a little strange. The contract of writeScriptBin is to write the evaluated string into the file myscript (AFAIU). Hence the contract dictates that both the path and even the content are known at the evaluation stage.

…it’d have to be coupled to the structure of Nixpkgs, or it’d have to statically analyze Bash scripts and find echo s to paths it can recognize statically

But does it really? Given the above contract readFile("myscript") would not need any analysis at all. We and even nix should know both path and even content.

I have the feeling that this comment might point to something that might be the last bit of the puzzle…

And notice too that this script is defined in the Nixpkgs repo (along with all the other trivial builders), not in Nix itself.

Are you saying nix is not aware of the implementation of writeScriptBin and hence knows nothing about the contract? (And it would need to be implemented in nix itself to draw conclusions from that contract?)

rhendric · December 9, 2023, 4:25pm

I’d phrase it as Nix is only aware of the implementation of writeScriptBin, but that implementation expresses the contract only through a Bash script that Nix can’t be expected to understand.

Imagine that you wrote a Python library called Pypkgs, and in that library you’ve defined a function called writeScriptBin() that calls out to the Python built-in os.system(). The actual text passed to os.system() contains f"echo -n {text} > {target}", and the documentation for writeScriptBin() says that it takes a string argument text and writes it to a file. Is it reasonable to expect a Python interpreter to understand your writeScriptBin() function so thoroughly that it could optimize away a subsequent file.read() call?

It’s technically possible—you could write an interpreter that has special knowledge of the name pypkgs.writeScriptBin, or you could write an interpreter that supports some sort of contract expression language that Pypkgs uses to express that this optimization is possible, or you could write an interpreter that understands Bash well enough to be able to analyze the string passed to os.system() and figure out what it does. But Python doesn’t, in practice, do any of these things, and I don’t think it’s hard to imagine why.

Nixpkgs and Nix are in the same situation. Nixpkgs is a library, interpreted by Nix. Nix knows its own built-ins, but some built-ins involve invoking shell scripts (builtins.derivation, which Nixpkgs wraps as stdenv.mkDerivation, plays the role of Python’s os.system()¹), and Nix doesn’t interpret shell scripts, only Nix expressions. All Nix knows about non-built-in library functions is their implementation, and all the implementation of writeScriptBin tells Nix is that it ultimately invokes builtins.derivation with a particular shell script that Nix considers to be a black box. Looking inside the box, or using not-written-in-Nix knowledge of what functions in Nixpkgs do, would be impractical for Nix for the same reasons it is impractical in Python.

¹ To forestall any confusion, they’re not literally equivalent. There are several important practical differences, chief among them the fact that builtins.derivation doesn’t immediately run anything when evaluated, but instead packages the script up to be executed during realization.