Nix team report 2023-04 — 2024-01

This is the second report of the Nix maintainers team, covering the past 9 months. The team was created in October 2022, and we published the first report in April 2023.

This report was written by @fricklerhandwerk, and reviewed by @edolstra @Ericson2314 @roberth @thufschmitt @tomberek.

tl;dr

We made some progress in all areas, sketched a roadmap for CLI and flakes stabilisation, answered many mid-sized design questions, merged numerous small improvements, and welcomed many new contributors. The code is getting easier to work with and we address contributions more consistently.

Still, our efforts lack focus and we struggle with finishing things that were started. Although we’re converging on a well-defined direction and reliable processes for the project, things are moving rather slowly: limited maintainer capacity and still-growing system complexity have become pressing issues.

We’ve merged more than 500 pull requests by 128 contributors, and closed over 300 issues discussed by 214 people. The number of open issues has risen from ca. 2500 to 2700, the number of open pull requests from 300 to 350.

Progress on stated objectives

In the past period we took on a small set of high-level responsibilities and stated objectives against which we can measure the effectiveness of our efforts. Overall, we have steady improvements to show, and can claim a series of modest successes due to continuous work. Much of this was only possible thanks to an enormous amount of volunteer contributions and ongoing organisational funding.

Set a direction for the development of Nix

With RFC 136 driven to conclusion by @Ericson2314 and @tomberek, and the corresponding implementation plan devised by the team, we have agreed on a general direction for feature development, which could be summarised as:

Reduce the experimental feature surface. Incrementally stabilise what we can support indefinitely.

Specifically, the next steps are:

  • Stabilise the nix store CLI
  • Stabilise fetchTree, then the lockfile format and semantics, and later the other components of flakes

We’ve spent substantial effort on solving a number of procedural and design issues in that regard, and find that we reached the goal to the extent possible with the available time.

While this is a clear focus, we also follow our review priorities: regressions, bugs, and improving testing. This makes the team’s resource constraints even more evident, especially since there are currently ca. 1000 largely untriaged issues labeled as bugs. We still have to find and explicitly define a healthy balance between measures to increase sustainability on the one hand and feature development on the other.

Ensure that the code is in good shape

  • @thufschmitt picked up and finished multiple old pull requests that addressed relevant issues.
  • @Ericson2314 continued with his series of refactorings to make the code easier to understand and move towards RFC 134, many of which were reviewed by @edolstra.
  • @edolstra added support for libgit to eventually replace shelling out to git in fetcher code.
  • @roberth opened a large number of issues to expose and keep track of tech debt.

Overall there are lots of incremental improvements happening under the hood, and we expect them to eventually culminate in the unblocking of interesting changes, such as native Windows support. But there is still a lot more to be done before things will go smoothly. We ran into substantial, deeply-rooted architectural issues multiple times, such as:

These also affect contributor experience and development agility, including the stabilisation goals: good design is useless if it’s too expensive to implement without breaking things.

Improve the contributor experience

The triage process already works quite well, such that the team looks at all issues and pull requests that are brought to our attention. But this is only a fraction of what’s going on, and so far we were reluctant to automatically add all new issues and pull requests to the project board. We may do this in the future.

We also noticed that assigned reviews still tend to get stuck. At the time of writing there were 225 open issues or pull requests assigned to maintainers (some of them have multiple assignees):

The time it takes to get pull requests merged keeps varying widely and on average does not seem to go down significantly. This is probably a symptom of the difference in availability of authors and reviewers, which tends to stretch interactions out so far that either side may lose track or motivation. It could be alleviated by faster response times to keep authors engaged, and more diligent prioritisation to keep reviewers engaged. But that is ultimately limited by maintainer capacity and competes with finishing work in progress.

Attract more maintainers

The team asked past contributors for help with maintenance, and @abathur and @cole-h volunteered for triaging. We decided to keep asking individuals to donate their time to help with reviews, but this did not turn out to work. Therefore:

We’re looking for volunteers to help with triaging by validating issues, curating labels, and suggesting priorities. Please get in touch to join the triage team!

We had 66 new contributors and 62 regulars, many of whom made substantial, high-quality pull requests:

Unfortunately we can’t list everyone here, and of course the number of pull requests says nothing about their significance. Sure enough, even those who got only one contribution merged made a dent. Many thanks to everyone who helps make Nix better! Please give us feedback on what we can improve so it’s easier for you to continue.

Notable design decisions

Notable pull requests merged

This is a selection of merged pull requests by label. It is manually deduplicated (since many have multiple labels) and filtered for what we consider non-trivial improvements.

Documentation (54)
New command-line interface (29)
Bugfixes (25)
Contributor experience (20)
Error messages (16)
New features (13)
Tests (11)
Development process (11)
Performance improvements (7)
Nix language and evaluator (8)
User experience (5)
Installer (4)

This overview allows for the following interpretation:

  • Documentation got a lot of attention, mainly due to Antithesis continuously sponsoring @fricklerhandwerk, and @Ericson2314 making infrastructure improvements.

  • Our focus on CLI stabilisation and accompanying communication produced many incremental improvements and clarified (and also revealed) some design issues.

  • Our efforts to streamline the overall development process are bearing fruit.

  • The small number of added tests is far behind what is needed to fix the staggering number of bugs and inspire confidence for more ambitious changes to the code base.

    This is likely due to

    • uneven distribution of knowledge
    • accidental, historical complexity of the implementation
    • the amount of time needed to build up the relevant context
    • the fact that we didn’t spend as much time explicitly sharing knowledge as we intended.

    We discussed expanding on knowledge sharing on multiple occasions (e.g. doing more group reviews or recording videos), and also doing fundraising to buy C++ expert time, but none of that has seen tangible results so far.

  • We could still spend time more consistently on what we originally intended to do. But effecting significant change in any of the problem domains will certainly require cutting scope elsewhere unless we can increase available resources.

We may reconsider aligning our process for writing release notes with our development priorities, to make this kind of overview continuous and automatic.

Scripts to obtain the raw data for this report

Goals for the next 6-month period

Given what went well so far and which obstacles are still in the way, for the next period we plan to address the major issues identified in this report and finish work in progress.

Set a direction for the development of Nix

  • Finish stabilisation of nix store commands.
  • Fully implement RFC 134.
  • Conclude stabilisation of fetchTree and move on to a specification of the lockfile format.

Ensure that the code is in good shape

Improve the contributor experience

  • Allocate fixed amounts of time for each maintenance task and stick to them.
  • Refine release notes to credit contributors and provide more information on progress in the various areas of development.
  • Bring down the number of open issues and pull requests to at most the level of March 2023.
  • Define an escalation path for disputes over maintainer decisions.

Attract more maintainers

  • Give more people commit access bound to clearly delineated responsibilities.
    Ensure that every subsystem has at least one dedicated maintainer.
  • Give more people triage access for helping with curation of the issue tracker and project boards.
    Ensure that every major topic has at least one dedicated person to respond to new issues within reasonable time.
  • Raise funds to add more developer capacity.

Thanks a lot, Nix team :bouquet::smiling_face_with_three_hearts:
Very nice to see noticeable improvements slowly landing :sparkles:
