Flake Design of System/OS

jeff-hykin · December 9, 2023, 4:16pm

This is a genuine question.

Why isn’t `system` an input for flakes?

I’m sure there are some good design reasons for it, but I can’t find discussions about it.

e.g.

{
  inputs.system = {};
  outputs = { system }:
    # value would default to builtins.currentSystem
    builtins.trace system
      { packages.default = stuff; };
}

or even

{
  inputs.system = {};
  outputs = { system }:
    { packages.${system}.default = stuff; };
}

nrdxp · December 9, 2023, 5:14pm

There were a lot of discussions about this years ago. Unfortunately I cannot seem to track down the discourse threads to link to atm. What I can tell you is that, to make dealing with the system much more ergonomic, I extracted some logic originally designed by @blaggacao for std into its own standalone project, nosys. It lets you write flakes as normal, but without having to deal with systems explicitly. It also allows for defining the systems as a true flake input.

There is also the nix-systems abstraction which can be used as inputs for nosys or just as a standalone abstraction.
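As a rough illustration of the nix-systems idea used on its own, the sketch below treats the list of systems as an ordinary flake input. It assumes the `github:nix-systems/default` repository evaluates to a plain list of system strings when imported; the nixpkgs branch and the `hello` package are placeholders chosen for the example, not something from this thread.

```nix
{
  inputs = {
    # Placeholder nixpkgs pin, chosen only for this sketch.
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    # Assumption: importing this input yields a list of system strings.
    systems.url = "github:nix-systems/default";
  };

  outputs = { self, nixpkgs, systems }:
    let
      # Build one attribute per system named in the `systems` input.
      eachSystem = nixpkgs.lib.genAttrs (import systems);
    in
    {
      packages = eachSystem (system: {
        default = nixpkgs.legacyPackages.${system}.hello;
      });
    };
}
```

Because `systems` is a normal input, a consumer of such a flake should be able to point it at a different list with `follows`, without forking the flake.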

jeff-hykin · December 9, 2023, 7:31pm

That’s really cool. I saw nosys a couple days ago when researching this, but I didn’t understand the use-case until you explained it just now.

If I’m understanding correctly, I can basically just pretend system is an input! That’s a big deal for me.

That said, it makes me even more interested in why this isn’t designed into flakes themselves.

> There were a lot of discussions about this years ago. Unfortunately I cannot seem to track down the discourse threads to link to atm.

I had a really hard time finding anything. “flake” and “system” are just so generic that there’s way too many hits. I hope someone will be able to find them as I would really like to know.

Infinisil · December 9, 2023, 9:35pm

https://github.com/NixOS/nix/issues/3843

ElvishJerricco · December 9, 2023, 10:22pm

pops popcorn

This is one of the contentious things causing flakes to be controversial.

It’s an enormously frustrating design decision IMO. Like, nixpkgs has a wonderfully nuanced mechanism for describing the build/host platforms, and flakes reduces it to “I guess you only care about an architecture”. Worst of all, when you use some flake, you can only use the specific platform that it has defined outputs for. If it builds for riscv but they didn’t throw that in their flake.nix, you’re out of luck and have to fork it. If you want to cross compile but they only defined native compilation in their flake.nix, you’re out of luck and have to fork it.

You’re right to question it. It’s a terrible design.

roberth · December 10, 2023, 3:45am

Yes, this issue is already in the Flakes milestone for good reason.

jeff-hykin · December 11, 2023, 2:32am

(Thanks for the link @Infinisil! I learned a ton from that)

_

I don’t want to re-open a feud, but I do need to continue the conversation if I’m going to understand.

Edolstra spends a lot of time thinking about design, and has more experience than any of us AFAIK. I’m not saying he can’t be wrong, but I would like to be VERY confident I understand what he is saying before disagreeing. Already the more I dig into the parts I don’t understand, like hydra, the more I see his point of view.

_

My Best Summary of (that Discussion)

I’m going to put a mark on the ones I’m confused by

Preface: There are actually three discussions
- Why is system not a typical/conventional input (but not special)
- Why is system not a special input
- Why is system needed in the output packages.${system}.name

1. Enumeration (comment)
   - Currently every flake input has 1 universal default value
   - Because they’re pure, this means flakes also have 1 universal default output
   - If system is a special input, then this is no longer the case; the default contents will be different depending on who is looking
   - (Note: I’m guessing that having null as the default system-input value is seen as not helpful)
2. System-as-an-input breaks package access (comment)
   - Let’s say input.system = system1
   - A big flake, like nixpkgs, needs to handle cross-compiled packages (inner-systems)
   - Well, package1 might need access to package2-for-system2, package2-for-system3, etc. But it can’t really access package2-for-system2 because input.system is locked to system1.
   - (I’m guessing that a recursive call of the flake is looked at as unfavorable)
   - (I’m also guessing that making package2 a function that takes system as an input is seen as impractical, but I’m not sure)
3. The main purpose of flakes is caching (aka hermetic evaluation). System-as-an-input breaks this (comment)
   - The path/result of packages.linux.thing is the same, even if it’s (theoretically) evaluated on a non-linux machine
4. The flake specification is supposed to be a generic pattern; a tool to manage inputs and create a lockfile without much extra structure. System as a special input would break this pattern (comment)
   - Creating a workaround such as "systems = " (at the same level as inputs =) ruins the generic pattern
5. Flake outputs should work, not “might work” (comment)
   - If flakes are a function of system, then, without an additional mechanism, trial-and-error will be the only way to know which systems it supports
6. Evaluation caching becomes harder (not a huge problem for system, but would be a problem for arbitrary arguments). (comment)

I took a lot of liberty in those summaries, so please correct me if I’m wrong.

_

Questions/Clarifications

If I can’t get these cleared up I’ll ping Edolstra.

I am now looking at this from the perspective of sometimes having system as a normal input; not necessarily changing packages.${system}.name, and not necessarily having system default to builtins.currentSystem.

The most important reason I’m confused by is the mix of #6 (caching becomes harder) and #3 (hermetic evaluation)
- My current understanding is flakes are basically a pure function with extra structure
- We can cache the output of every flake input, right?
- Is it harder because of needing to enumerate the input arguments, or is the downside that a higher number of small cache entries is worse than fewer but larger cache entries?
- In #6 I don’t have a good idea of what “arbitrary arguments” would look like. Was the comment in reference to commandline flake installation?
I agree that advertisement/enumeration is incredibly important and extremely underrated. Indexability cannot be an afterthought. My (again genuine) question is, what would the consequences be if we advertised testedSystems without using packages.${system}.name?
- To keep hermetic evaluation, let’s say input.system defaults to null.
- Metadata of the flake shouldn’t depend on system, so we can still evaluate metadata
- What are some of the tradeoffs of having the flake lock perform the following:
```
let
  testedInputs = (
    outputs {} /* <- system defaults to null */ 
  ).meta.testedInputs;
in
  map (each: outputs each) testedInputs
```
  For those who say “there’s no difference”, I can think of at least one:
  - It doesn’t allow for attribute autocompletion
  Maybe this is what was meant by arbitrary arguments.
  (I’m still not sure how it would hurt caching)
  E.g. { cuda = true }, { cuda = false } instead of torchWithCuda and torchWithoutCuda
What are the downsides of having nix tooling (nix-env, nix develop, etc) support outputs that sometimes don’t contain a system attribute?
- say, for example, there was packages.null.name
- I don’t think all flakes need system as an input
- I think nixpkgs, the lib flake, and a font package are all good examples of not needing system as an input.
For “flake outputs should work”, I think there’s a bigger point being made that’s going over my head.
- I’m sure Edolstra knows that, even without system as input, some packages claim to work on a system, but don’t actually work on that system. And also the reverse; some claim to not work on a system, but actually do work on that system.
- My question is, does “flake outputs should work” mean that there should be a finite set of possible arguments? Or is it more about testing, where explicitly including a system makes it more likely that someone tested it?
For #2 (cross-compiling access)
- I think this is a really compelling argument. At minimum, having system as an input would make it inelegant to access their cross-compiled counterparts.
- Was part of the reasoning of #2 that it would be bad design to have a flake call itself with different inputs?
- Is some of the reasoning that it would be impractical to change all individual pkgs within nixpkgs to be a function of system?

iFreilicht · December 22, 2023, 9:40am

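I think this is a solvable issue. If you squint a bit, system is already a special input, it’s just very strictly controlled by Nix itself. However,

> If flakes are a function of system, then, without an additional mechanism, trial-and-error will be the only way to know which systems it supports

That is very correct. There needs to be an additional mechanism.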

The issue here is also that a package can have multiple systems as its input: the one it’s built on (buildPlatform in the context of nixpkgs), the one it’s built for (hostPlatform) and (in the case of some compilers) the one it will produce binaries for (targetPlatform).

Right now, a flake’s system is both the build and the host platform. If I want to build an x86_64-linux output on my aarch64-darwin machine, I have to set up a VM or a server acting as a remote builder. The only other option is for the flake to specifically have an additional output for every potential target, like I did with nix-nar-rs.

It would be very cool if packages.x86_64-linux could just be cross-compiled when I try to build it, but it just doesn’t work that easily. But maybe that’s what we should be aiming for?
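For comparison, here is a minimal sketch of how nixpkgs itself (outside of flakes) lets the build and host platforms be chosen separately. It assumes `<nixpkgs>` is available on `NIX_PATH`; the two system strings and the `hello` package are arbitrary examples.

```nix
# Assumes <nixpkgs> resolves via NIX_PATH; the system strings are examples only.
let
  pkgs = import <nixpkgs> {
    # The platform the build runs on.
    localSystem = { system = "x86_64-linux"; };
    # The platform the resulting binaries run on.
    crossSystem = { system = "aarch64-linux"; };
  };
in
{
  # The platforms nixpkgs recorded for this package set.
  build = pkgs.stdenv.buildPlatform.system;
  host = pkgs.stdenv.hostPlatform.system;
  target = pkgs.stdenv.targetPlatform.system;

  # A package cross-compiled from the build platform to the host platform.
  hello = pkgs.hello;
}
```

A single `system` string in `packages.${system}` has no slot for this distinction, which is why a flake currently needs a separate, explicitly named output for each cross target.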

I feel like it’s too late for that. Having inputs, outputs, description and nixConfig, which are required to be of certain types, is already extra structure. If the design goal was to impose as little structure as possible, nix would have to do something like niv and allow to lock basically any nix file.

Flakes become useful because of their structure.

To me, the solution would look something like this:

{
  systems = [
    "x86_64-linux"
    "aarch64-linux"
  ];

  outputs = { self, nixpkgs } : {
    perSystem = system: {
      defaultPackage.${system} = nixpkgs.${system}.hello;  
    };
  };
}

Which is still hermetic and can still be enumerated independent of the current platform:

$ nix flake show --all-systems
git+file:///Users/feuh/helloflake
└───defaultPackage
    ├───aarch64-linux: package 'hello-2.12.1'
    └───x86_64-linux: package 'hello-2.12.1'

But if you want to override the system input (maybe because the flake author didn’t consider your system), you can:

$ nix flake show '.?system=aarch64-darwin'
git+file:///Users/feuh/helloflake?system=aarch64-darwin
└───defaultPackage
    └───aarch64-darwin: package 'hello-2.12.1'

This does not break hermetic evaluation, because you’re basically creating a new flake, which does not depend on the system it is being evaluated on. Maybe it makes sense to add a --for-current-system flag to make this a bit easier to discover, but that doesn’t really matter for the conceptual discussion.

It also doesn’t break outputs that do not depend on the system, like nixosConfigurations.

I feel this mechanism could be used for adding generic overrideable inputs to flakes, but that might be going a bit far for this discussion.

I interpret that as meaning “we shouldn’t just assume any flake works on the five default systems”. Which is true. It would be annoying to run nix flake show, see your system is supported, but then the evaluation or the build fails.

psionik · March 12, 2024, 3:39am

The genAttrs solutions floating around also lead to some impedance even without cross compiles. When defining the outputs with something like forEachSystem, the user will be fighting the code structure if they didn’t declare all of their outputs as functions of system. This amounts to requiring re-structuring logic within the flake.

Even then, for the cross compile, since we can’t have nested attribute sets, we end up appending the host platform. While it also makes it less convenient to select the cross outputs for a particular cross system, the bigger issue is again requiring restructuring logic within the flake.

Finally, omitting paths entirely for one system or another requires restructuring to remove the null attributes or else get error: expected a derivation. The solution is to embed more re-structuring logic into the flake.

If I could suffer enough amnesia, I may prefer flakes to declare functions instead of attribute sets. The functions would be consumed by passing in either the declared or custom set of systems and cross systems. Outputs is a function. Why shouldn’t it also return a set of functions?

But I want to stress, the need for restructuring logic within flakes and the relationship with these system annoyances is my biggest complaint. This is what pushes all the wheel reinvention and annoyances for new nix users in writing and understanding what’s written within flakes.
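The three kinds of restructuring described in this post can be sketched in plain Nix. Everything here is hypothetical: `forEachSystem` and `dropNulls` are hand-written helpers, the package values are strings standing in for derivations, and the `tool-riscv64-linux` name only illustrates the "append the host platform" convention.

```nix
let
  systems = [ "x86_64-linux" "aarch64-linux" ];

  # Restructuring 1: every output has to be written as a function of system.
  forEachSystem = f:
    builtins.listToAttrs
      (map (system: { name = system; value = f system; }) systems);

  # Restructuring 3: attributes that are null for a system must be removed,
  # otherwise tooling that expects a derivation would reject them.
  dropNulls = attrs:
    builtins.removeAttrs attrs
      (builtins.filter (name: attrs.${name} == null) (builtins.attrNames attrs));
in
{
  packages = forEachSystem (system: dropNulls {
    tool = "tool-for-${system}";

    # Restructuring 2: no nesting is allowed below packages.<system>, so the
    # cross target ends up encoded in the attribute name.
    tool-riscv64-linux = "tool-built-on-${system}-for-riscv64-linux";

    # Only offered on one of the two systems in this made-up example.
    x86Only = if system == "x86_64-linux" then "x86-only-tool" else null;
  });
}
```

None of these helpers is complicated, but each flake ends up carrying its own copy of them.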

domenkozar · March 12, 2024, 6:30am

I wrote an extensive report in 2021 on why system should be an input, gave up a year later on it ever being addressed, and so made https://devenv.sh with the correct interface.

arianvp · March 12, 2024, 6:34am

I don’t understand the argument that system-as-input breaks evaluation caching. Functions are trivially memoizable.

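function memoize(f) {
  // Cache of previously computed results, keyed by argument.
  const results = new Map();
  return function (x) {
    if (results.has(x)) return results.get(x);
    const value = f(x);
    results.set(x, value);
    return value;
  };
}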

So if we have an enumerable set of systems, having system be an input sounds like an obvious and easy-to-implement win.
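The same idea can be sketched in Nix itself, where laziness gives in-process memoization over an enumerable set of systems almost for free. This is only an illustration of the argument, not how Nix's on-disk evaluation cache works; `outputsFor` is a hypothetical stand-in for "the flake's outputs as a function of system", and the package values are plain strings.

```nix
let
  systems = [ "x86_64-linux" "aarch64-linux" "aarch64-darwin" ];

  # Hypothetical outputs function; the trace makes each evaluation visible.
  outputsFor = system:
    builtins.trace "evaluating outputs for ${system}" {
      packages.default = "hello-for-${system}";
    };

  # One lazy attribute per system. Each value is a thunk that is forced at
  # most once, so repeated lookups reuse the already computed result.
  memo = builtins.listToAttrs
    (map (system: { name = system; value = outputsFor system; }) systems);
in
[
  memo.x86_64-linux.packages.default
  memo.x86_64-linux.packages.default
]
```

Evaluating this strictly (for example with `nix-instantiate --eval --strict`) should emit the trace message for `x86_64-linux` only once, and never for the systems that are not looked up.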

jeff-hykin · March 13, 2024, 4:47pm

Let’s
- Use a full example (so it can be criticized)
- Address issues sequentially (e.g. not all at once, and not bouncing around)
- Focus on actionable aspects instead of ideological ones

Minimally-invasive is my objective for this example.

{
  inputs = {
    # defaults to { value=null; }
    mode.url = "github:jeff-hykin/snowball/6b9d5dcaf2f685f90f02058f059fe818098171d5";
    # defaults to { value=null; }
    system.url = "github:jeff-hykin/snowball/6b9d5dcaf2f685f90f02058f059fe818098171d5";

    pkgSource1.url = "somewhere";
  };
  
  outputs = { self, mode, system, pkgSource1 } :
    if mode.value == null then
      {
        # normal/legacy output (backwards compatible)
        defaultPackage.x86_64-linux.default = pkgSource1.packages.x86_64-linux.cowsay;
      }
    else if mode.value == "enumerate" then
      [
        # system isn't special, any input can be in the attr set
        { system = "x86_64-linux"; }
        { system = "aarch64-linux"; }
      ]
    else
      # a new approach
      {
        defaultPackage = /* something with pkgSource1 */;
      }
  ;
}

That^ might solve a lot of things, but we really should only talk about what it doesn’t solve.

Issue #1: how will system (and other enumerated inputs) be passed along (e.g. `pkgSource1`’s system)?

- outputs gets called once with mode = { value = "enumerate"; }
- outputs gets called again with mode = { value = "eval"; } and system = "x86_64-linux"
- How do you think system should get passed down into pkgSource1?
- NOTE: some packages don’t take a system input (in theory), and some packages take a hostSystem and targetSystem input
