Wild idea: How about splitting nixpkgs and nixos?

I mean, can’t it be done in a way that preserves the gains (not only the ones exposed here) and limits the downsides? It would be enlightening to hear somebody who has all the arguments against a split in their pocket argue for it.

It seems to me the status quo is a local maximum and does cause some cognitive dissonance.

I’m mainly (theoretically) concerned with poor public (visible) interfaces between the two (in principle) independent entities (in code, documentation, maintainers, users, …). I can’t explain it in great detail, but it feels like annoying and bad design without a potentially good (enough) justification.

NixOS and NixPkgs have independent value propositions in their own right. An upstream-downstream relationship (also socially-organizationally speaking, with all the theoretical benefits of increased division of labor and authority): why not?

Organizational theory teaches us that compartmentalizing the problem and pushing responsibility to the leaves is an excellent strategy for dealing with complexity in complex systems. In short: “divide & conquer”.

  1. pushing core changes through, which otherwise may not happen due to the huge burden

Point 1) is, I think, a good reason to consider smaller repos. It’s just a pain to make a core change. Of course, you should always be careful with those kinds of changes, but the burden that exists now is shifted too much onto those working with the core packages.

On the other hand, if we cannot reach a decision on when a core change should just ignore the leaf breakage (or when a copy of the old thing needs to be temporarily pinned) in any other way than by making this breakage harder to see because of a repository split… that would sound very dark.

NixOS and NixPkgs have independent value propositions in their own right. An upstream-downstream relationship (also socially-organizationally speaking, with all the theoretical benefits of increased division of labor and authority): why not?

Large enough codebases end up as monorepos vendoring the dependencies. We just have the luck of declaring the vendored copy the upstream.

1 Like

For me this is a strong argument in favor of maintaining nixpkgs strictly separately. NixOS is the main consumer, but not the only one. An enforced boundary would induce more awareness of change management and help treat other downstreams such as nix-darwin or home-manager with even more respect, hopefully easing integration for everyone by leveling the playing field.

Merging nixos-hardware into NixOS on the other hand makes perfect sense, because that is about configuration under Linux, which is one of the central subjects of NixOS.

7 Likes

tl;dr: Nope. Please don’t unless you have found tons of free (well paid)
labour.

I am not sold on any of the ideas regarding splitting. I actually think
the downsides will outweigh the “advantages”. Synchronising between
two repositories sounds like a nightmare. I’ve done it with more than
two and it is always ugly. Even with flakes that wouldn’t be nice.

It is nice to post a single PR that refactors how an interface (e.g. of
systemd) is used throughout the entire repository in an atomic fashion
(one merge commit) instead of having to do two code reviews with two
audiences across repositories.

Another example is fixing packaging and the corresponding module in
NixOS and adding a proper test in a single PR. Theoretically you should
only start working on the NixOS changes once nixpkgs has adopted your
changes but those changes might break current NixOS… No thanks.

If eval times for CI are the main issue we should perhaps look into a
(more granular) eval cache or major speedups in that area. We would all
benefit from that.

Now some users might say that we can use flakes and get caching for each
of the repos (making CI times faster) but those are neither specified
nor stable. The caching there also only works per git repository and not
per content hash of a directory/flake. Not an option without further
work in that area.

I’ve had thoughts about composable flakes within one monorepo for a
while (mostly from a UX perspective) over here:
https://github.com/NixOS/nix/issues/4218
We might be able to compose Nixpkgs with something like that without
effectively moving most of the code into just one big flake.nix and
flake.lock file. This obviously depends on per flake caching to work
sufficiently well.
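
As a rough sketch of the shape such composition might take (the paths and re-exported attributes are purely illustrative, and whether relative path inputs like these get useful per-flake caching is exactly the open question in that issue):

{
  description = "Hypothetical top-level flake composing sub-flakes within one monorepo";

  inputs = {
    lib.url = "path:./lib";
    pkgs.url = "path:./pkgs";
    nixos.url = "path:./nixos";
  };

  outputs = { self, lib, pkgs, nixos }: {
    # Re-export the sub-flakes' outputs instead of defining everything in one big flake.nix.
    inherit (pkgs) legacyPackages;
    inherit (nixos) nixosModules;
  };
}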

I’m mainly (theoretically) concerned with poor public (visible)
interfaces between the two (in principle) independent entities (in
code, documentation, maintainers, users, …). I can’t explain in great
detail, but it feels annoying and bad design without a potentially
good (enough) justification.

NixOS and NixPkgs have independent value propositions in their own
right. An upstream-downstream relationship (also
socially-organizationally speaking with all the theoretical benefits
in increased division of labor and authority): why not?

I would say that the only public interfaces are those defined in the
(generated) documentation. All other interfaces can be changed at any
time (hopefully not in a stable release). This gives us a lot of chances
to improve the code over time. The users have the module system and
language specific tooling at their disposal. If they need more
guarantees they should probably make the move to stabilise some
versioned API (with tests) that they can then try to support going
forward (as part of nixpkgs). If that works well enough (both
technically as well as with manpower) we could consider committing to
some APIs and perhaps splitting things. Before that has been shown I do
not believe that splitting repos magically reduces overhead, makes
everything move faster and better.

It is (again) a lot of work to maintain that API while still allowing
several improvements to our packaging system on an ongoing basis. We
aren’t just all sitting there and adding more packages all day. I know
that a few are actually doing a lot of work to make those language
ecosystems work (better) over time. If neither of those people wants (or
has the capacity) to add a long-term stable API it will not happen.
Regardless of Discourse, RFCs, IRC, … or other discussions.

If you want clearer separation between parts and pieces I recommend you
define

  • where you would “cut”,
  • how you handle versioning of that interface,
  • how you can enforce that interface and provide good developer
    ergonomics when it breaks or wasn’t properly used,
  • and to write a bunch of tests that ensure that the guaranteed API is
    indeed still performing as specified.

Once we are through the above for packages, package ecosystems (e.g.
buildPythonPackage, haskell, php, C aka stdenv, …), lib and whatever
else there is between the two we can think about splitting repos. This
would then still only guarantee that we have an interface but not that
we can actually do all the work required to keep both projects in sync
(both code and people).

Do not get me wrong: I would love the above interface as it would
improve the experience when dealing with packaging individual projects
(inside and outside of nixpkgs).

12 Likes

An interesting data point could be mobile-nixos, which is maintained outside the mono repo. Maybe samueldr can share his experience?

1 Like

The first step might be to move application software (browsers, jetbrains, then {go,rust,java,ocaml,haskell} + everything written in those languages, then everything built with a trivial callPackage {}) to some kind of AUR with rolling updates.
It seems easy and safe, and will reduce the load on release maintainers.
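
For a sense of what consuming such a split-out application set could look like (the tarball URL is invented and this is only a sketch), downstream users might layer it on a core package set as an overlay:

let
  # Hypothetical rolling "apps" repository whose default.nix evaluates to an overlay function
  appsOverlay = import (builtins.fetchTarball
    "https://example.org/nix-apps/archive/main.tar.gz");
in
import <nixpkgs> { overlays = [ appsOverlay ]; }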

Separating NixOS|Nixpkgs is more tricky. Where would GCC be?

2 Likes

I agree with your argument in general. A monorepo avoids much overhead, which is significant with limited resources. You expressed what I had missed adding in the other comment: you can of course maintain those interfaces within the monorepo.

Thanks for your thoughtful elaboration. I don’t have any opinion or agenda here, and like @turion I hope this discussion will produce some insight. It’s more important that stuff works than exactly how we get there.

Intuitively I follow the idea that reducing the number of maintained packages should reduce the effort of producing a release. But that is where I fear more of a mess and duplicated effort to get the complete ecosystem building. What would happen to the unique feature of having the whole world inside nixpkgs? Wouldn’t we need another layer that combines all of the disparate collections back into one?

Every now and then the discussion pops up how to better compartmentalize responsibilities. Maybe this is more of a social problem than a technical one?

Excuse my ignorance: Why should gcc not trivially live in nixpkgs? What else does NixOS do except arrange files in a certain manner so that an operating system will boot out of them, some of those files being binaries from nixpkgs?

3 Likes

@volth

The first step might be to move application software (browsers, jetbrains, then {go,rust,java,ocaml,haskell} + everything written in the languages, then everything built with trivial callPackage {}) to some kind of AUR with rolling updates.
It seems easy and safe, and will reduce the load on release maintainers.

That has the same problems as noted above: you update a trivial C library, which breaks its Rust binding; you update that to fix it, which breaks a Python application depending on it; you update that one… and so on.

Today, at least theoretically, you have a chance of making all of that atomic, and meta.tests exists exactly for the purpose of helping to make such things atomic.

With the repository split as you propose, working meta.tests between the parts would be impossible.

@turion

IMO, as I continue to repeat every time this topic comes up, what we actually need is single-tree-in-multiple-repositories model of the Linux kernel, where each subsystem has its own repository (and mailing list/issue tracker/wiki/firmware database/whatever else the people of the subsystem in question need) while all the code still lives in a single tree to make atomic changes across the whole thing possible.
This is trivial with mailing lists (just add a Cc: to your [PATCH] email), but currently impossible with GitHub (imagine how cool it would be if you could ask GitHub to force a set of PRs (into the same or different repositories) to be merged at the same time into their corresponding repositories as soon as all the PRs “agree to merge” and there are still no conflicts anywhere, but alas).

In principle, GitHub, being a platform that hosts like 80% of all libre and open source software, could make this work even across completely unrelated repositories: just allow PR authors to specify dependencies between PRs and specify which ones need to be merged at the same time.

Note, however, that deterministic testing for non-single-tree to-be-merged-at-the-same-time PRs is a non-trivial endeavor anyway, since the algorithm looks something like this: take all the linked PRs, apply them to their corresponding repositories, figure out how those changes correspond to changes in reverse-dependencies (in the Nix case you would need to take the commit hash of the resulting merge and substitute it into fetchFromGitHub in the corresponding Nix expression in the Nixpkgs repository; note that your to-be-merged-at-the-same-time PRs can’t specify those hashes, since they don’t know any commit hashes before all the repositories agree to merge), build, test.
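
To make that substitution step concrete, here is roughly what the downstream side of such a linked merge looks like in Nixpkgs terms (owner, repo, rev, and hash are placeholders): the pin can only be filled in once the other repository’s merge commit exists.

{ stdenv, fetchFromGitHub }:

stdenv.mkDerivation {
  pname = "some-dependency";
  version = "unstable-2021-01-15";
  src = fetchFromGitHub {
    owner = "example-org";                                    # placeholder
    repo = "some-dependency";                                 # placeholder
    rev = "0000000000000000000000000000000000000000";         # unknown until the linked PR merges
    hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";  # likewise, recomputed afterwards
  };
}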

6 Likes

Don’t forget that the bot is imperfect; updateScripts can be imperfect as well.

Upstream unit tests should be part of the checkPhase or installCheckPhase. If you mean a “user scenario”, this can be achieved today by doing passthru.tests.<unit> = runCommand "..." '' '';
Ofborg will run anything under passthru.tests.
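
For concreteness, a minimal sketch of that pattern (the foo package and test name are illustrative): a tests.nix file that a package can wire up via passthru.tests = callPackage ./tests.nix { };, all of whose attributes ofborg will build, and which you can run locally with nix-build -A foo.tests.versionCheck.

{ runCommand, foo }:

{
  versionCheck = runCommand "foo-version-check" { } ''
    ${foo}/bin/foo --version > $out   # the derivation fails if the installed binary is broken
  '';
}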

I don’t agree with this; some packages may also require NixOS release notes changes. Or a package may have some regressions which aren’t captured in the package’s release notes.

I actually really like this for packages such as postgresql. Some packages have many dimensions of compatibility, and it’s hard to determine regressions without integration-like tests.

I think this is the right model: if I depended on a channel to enable my computer to boot, I would hope to have a guarantee that this was the case. Also, channels are usually blocked because of arm/darwin build delays.

x86_64-darwin 	18311
aarch64-linux 	6304
x86_64-linux 	80
i686-linux 	20

If we dropped (or significantly increased) support for darwin and arm, we could have multiple channel releases a day.

Overall, I think our current monorepo provides more benefits than drawbacks. Nixpkgs can be thought of as a “database of expert knowledge”, and being able to query information about your changes goes a long way toward determining the scope and complexity of what you’re doing.

There’s also the issue of liability and governance. If an organization is going to go “all in” on nixpkgs, having automatic merges isn’t great from an accountability perspective. Ideally you would want someone to have reviewed the package changes. I think we should focus on making reviewing the changes easier, for example, nixpkgs-update will post the changelog url, if meta.changelog is available.

10 Likes

Meta

Seems like most of the storm has passed, so I’ll answer a few points and try to collect the most important learnings.

First, thanks for your detailed and diverse input! It’s become clearer to me that many people have different viewpoints on where the project should go, maybe partly because everyone is focussing on different parts and aspects of nixpkgs.

Second, it seems clear that no RFC is going to come from this any time soon since the necessary work would clearly be huge and the ratio of advantage to disadvantage is disputed.

Lastly, what I’m after is really making nixpkgs better to maintain so that more new maintainers will enter, and the existing maintainers are less likely to be frustrated. Anything that helps that goal is good.

Some comments

Unit tests

Yes. That’s what I tried to refer to as “unit tests” (because I think that’s what they are).

Automated merges

Some packages, on some version bumps, yes. If GCC is going up a major version then I guess this is worthy of a release note. But if VSCodium goes from 1.52.1 to 1.52.2 then it is not, right? But PRs like the latter make up the bulk.

For substantial changes, yes. For minor changes, I trust a well-written test and automated merge more than a human review without a test. (Anecdote: When I try to sell NixOS/nixpkgs to my company, the negative responses are typically 1. security updates don’t come fast enough and 2. nixos.com is a porn site. Can’t really deal with 2., but 1. is at least tangentially related to the too-slow advancement of channels.)

So if I focus on the core issue that is bothering me, and try to solve that, I would propose: Have automated merges if the following strict conditions are satisfied:

  • A single package is bumped by an update script, only versions & hashes change
  • The package follows PVP and the version bump was minor
  • There is a unit test

If this was completely automated, I think we would still catch errors through human review at the major versions, and through regular usage. But we would free a lot of maintainers from menial tasks and allow them to do interesting maintenance work! Such as reviewing bigger changes and solving issues.
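
To illustrate (the package name, URL, and hash are invented), the kind of change such an automated merge would be limited to looks like this: the update script bumps the version, recomputes the source hash, and nothing else moves. A passthru.tests entry, as in the earlier sketch, would supply the required unit test.

{ stdenv, fetchurl }:

stdenv.mkDerivation rec {
  pname = "example-app";
  version = "1.52.2";   # bumped from 1.52.1 by the update script
  src = fetchurl {
    url = "https://example.org/example-app-${version}.tar.gz";        # placeholder URL
    hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";     # recomputed on each update
  };
}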

Some learnings

Github limitations

That’s a new perspective that I hadn’t heard about before. Until now I thought that nowadays a platform like GitHub or GitLab is the only way to organize massive collaboration (the Linux kernel simply sticking to older tech for historical reasons), but yes, now that you mention it, the standard PR model brings a bottleneck.

A countermeasure is merge trains, either automated like Bors or manual like haskell-updates (and possibly other language frameworks as well?).

“Organizational” reasons

[quote=“fricklerhandwerk, post:10, topic:11487”]

I definitely concur with that. Other nixpkgs consumers are often treated as an afterthought. I think the comparison with home-manager is good: NixOS and home-manager are two ways of building an environment for a user with nixpkgs. Of course NixOS is larger, older, and more widespread, but qualitatively this is true.

Yes, it is! In a nutshell, I’m proposing to manifest this distinction by making the two channels into two repos.

Testing across the ecosystem

The worst degradation of test quality would probably come in packages that are mainly services. SQLite may be tested superficially with a few command-line queries, but a distributed database service or a web application like Nextcloud really has to be tested as a service.

Test improvements and eval times

I’ve probably misjudged the distribution of CI times. If splitting off NixOS will not make it easier to have green pipelines all the time, then an important advantage is missing for me. But how can we make evaluation faster? How can it even be cached?

7 Likes

GCC is a Nix package, obviously. Even Linux kernel is a Nix package.

NixOS should ideally be seen as a set of Nix-glued scripts that generate a Linux distro. In this sense, I would even treat NixOS as a

nixos = callPackage ../distros/linux/nixos { };
kfreebsd = callPackage ../distros/freebsd/ { };
...
3 Likes

It is already so, plus NixOS adds modules and options to tune the systemClosure derivation, which prevents exposing it under pkgs.

2 Likes

I see all the “let’s split” initiatives as “I made so many changes that I need to fork, but the whole beast is too big to maintain as its own fork”.

An example of an instance where the requirement of same-PR priority in the pkgs/nixos upstream-downstream relationship heavily penalizes other possible downstreams.

I did an experimental carve-out of nixpkgs.lib for flake users.

4 Likes

I appreciate the experiment; there is nothing better than going hands-on. Let’s see: adoption will tell how much value people attribute to it.

3 Likes

It would also be (academically) interesting to extract pkgs/pkgs-lib and its secret twin, pkgs/build-support; conceptually they are on the same layer of package-dependent library functions. They could well go into the same conceptual bucket as builtins.

Can anyone enlighten me about the associated costs of providing pkgs-dependent library functions with a different pkgs version than the (potentially same) packages they would actually build? (Besides the obvious downside of increased closure size.)

1 Like

In other words, is there a significant cost to building a different version of git than the version that was used to fetch its source?
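
Roughly, the situation being asked about looks like this (a sketch with invented pins and URLs): the fetcher, and hence its git, comes from one nixpkgs evaluation while the package is built with another, so two different git closures end up involved at build time.

let
  pkgsForFetching = import (fetchTarball "https://example.org/nixpkgs-old.tar.gz") { };  # placeholder pin
  pkgsForBuilding = import <nixpkgs> { };
in
pkgsForBuilding.stdenv.mkDerivation {
  pname = "some-project";
  version = "0.1";
  # Fetched with the git carried by pkgsForFetching's fetchgit, built with pkgsForBuilding's toolchain.
  src = pkgsForFetching.fetchgit {
    url = "https://example.org/some-project.git";        # placeholder
    rev = "0000000000000000000000000000000000000000";     # placeholder
    hash = pkgsForFetching.lib.fakeHash;                  # placeholder
  };
}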

1 Like

Just found a comment that ties in here:

Further down the thread people express their viewpoints against splitting, but this conversation is embedded in an issue with a slightly distinct topic so the comments are spread out.

8 Likes