We need more defined guidelines for package inclusion

I’ve wondered about this, from outside the bubble. I’m probably not rooted enough to have a long coherent thought on the topic. So this is mostly oblique, addressed more to the maintainability side of this.

But I have wondered how/if the various nix commands could be enhanced to help build big levers out of small nudges. That’s really abstract, so some more-pragmatic examples… (I’m imagining manual commands with configuration options to make the behavior automatic in almost every case here; never default-on; IIRC there’s overlap between some of these and existing helper-bot behavior).

  1. If heuristics can minimize false reports, a command for reporting package build failures in an aggregable way, paired with inline and end-of-output prompts to use them. If the heuristics are tractable, I think there’s more than one lever lurking here.
    • Terse issue reports that reflect the heuristic confidence and collect specific details based on context are probably easier to triage than human-language reports by someone who isn’t quite sure what broke or didn’t copy the right part of the error stack.
    • A system fielding code-reported issues might be able to suggest fixes (i.e., generate a PR and start the testing process before a human has had a chance to triage), or even automatically report some classes of upstream issue when the information is available. Broken packages can be identified sooner.
    • Users running into already-automatically-reported build breaks can be given a URI for the issue thread to reduce duplicate effort on github, here, IRC, reddit, etc.
  2. Same/similar mechanisms might also help triage user errors (again, github, here, IRC, reddit, etc) without burning time/effort/patience of key community members.
  3. When someone/thing triggers a build of a nixpkgs package sourced from a well-known platform with easily-queried versioning (json/xml/rss/etc.) and the package version is out-of-date and there isn’t already a commit/report/PR, this could be automatically reported (in to nixpkgs, and out to the user). If there’s already an automated report on a package update and a new version is released, the issue can be updated with the latest version to save time triaging/shepherding updates for the fastest-moving packages. If it’s out of date but the new version is already in the pipeline, it could just let the user know (whether it’s in a newer commit on their channel, or whether it’s in the next channel update, etc.)
    • This could also generate a PR and start tests. Naively, the same dependencies. With a few adapters (I’ve been meaning to start a separate thread…) this could be at minimum sensitive to the fact that the files that might play a role in defining dependencies for a given stack have changed.
    • If these are derived from activity rather than imperatively generated by bots, the velocity and workload will (hopefully?) be better balanced around what people are using rather than purely on whatever publishes most often.
  4. A manual command for reporting issues with a package that built fine but isn’t running correctly. Maybe this is a command for wrapping the whole execution and trapping output. These probably don’t auto-open issues without meeting a threshold number of reports unless heuristics say it smells like a common nix-ecosystem error pattern. Optimistically, something like this (if it produces real/hard errors), could even record the syntax and turn it into a test.
  5. If reported back, versions explicitly-built from sources with a knowable version-format probably represent (maybe in context with general cache hits/misses?) some signal about how people use a package. It’s not worth rat-racing a package if the sanctioned nixpkgs version isn’t significantly more common than others once overlays/shells/etc. are accounted for. It could help triage towards packages where the nixpkgs version is clearly canonical.
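
To make item 3 a bit more concrete, here is a rough Python sketch of the decision it describes. Nothing here is an existing nix feature: `classify_update`, its parameters, and the status strings are all made up for illustration, and version comparison is simplified to dotted numeric versions (something like `1.0rc1` would need a real version parser).

```python
from typing import Optional


def parse_version(version: str) -> tuple:
    """Turn a dotted numeric version such as '1.10.2' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def classify_update(packaged: str, upstream: str,
                    pending: Optional[str] = None,
                    reported: Optional[str] = None) -> str:
    """Decide what to do when a build is triggered (hypothetical helper).

    packaged: version in the user's nixpkgs checkout
    upstream: latest version published by the upstream platform
    pending:  version already in the pipeline (open PR or newer commit), if any
    reported: version named in an existing automated report, if any
    """
    latest = parse_version(upstream)
    if parse_version(packaged) >= latest:
        return "up-to-date"
    if pending is not None and parse_version(pending) >= latest:
        return "notify-user-only"        # the update is already on its way
    if reported is not None:
        if parse_version(reported) >= latest:
            return "already-reported"
        return "update-existing-report"  # bump the report to the newest release
    return "open-new-report"
```

For example, `classify_update("2.3.0", "2.4.1", reported="2.4.0")` returns `"update-existing-report"`, which matches the "issue can be updated with the latest version" case.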

I have some other thoughts, but I’ll hold off for tomorrow.

“Packages that will not be included: A package that is unmaintained upstream. Or stated by the author to no longer be used.”

There are a ton of packages that are either definitely unmaintained upstream or theoretically maintained with no commits, that just work and keep working. I would actually trust some of these more because of that.

Software maintenance is a symptom of problems, not something desirable per se.

But «likely to have dangerous problems discovered with no expectation of upstream fix» is probably indeed something to avoid (not all packages can realistically have meaningful hidden dangerous problems in them by now)

4 Likes

Software maintenance is a symptom of problems, not something desirable per se.

I don’t want to get too deep into the philosophy of software maintenance here, but I think this helps us only in the best of all possible worlds, and we’re probably not living in that one. Here and now, the overlap between “is actively maintained” (or at least: if a problem somehow pops up, an issue can be opened and will likely be dealt with within a foreseeable time-frame) and “it’s easy to see that this is a package that can be included without too much hassle” is almost complete.

Now, there are totally some packages out there that are only updated very irregularly upstream (maybe only 1-2 commits a year) and those should definitely be allowed to be included. As a first gauge, at the time of the init commit for a new package, that’s still a pretty good heuristic though. If somebody wants to include a package which hasn’t received any upstream development for a long time, I think the burden of proof is then on the would-be package maintainer to convince some reviewers that it’s include-worthy.

We won’t be able to come up with rules that will work for every single instance. There has to be some wiggle room. We also won’t get it right for every single package. But having some rules of thumb might help a lot. (A list of checks for new packages, akin to the checklist for every PR right now, might be enough already. A lot of PRs get merged without fulfilling every single check, and package inclusion would work similarly. A rough set of guidelines should do.)

1 Like

Sorry. What I wanted to get across is how the forces are being directed. Instead of enforcing policies, ask questions to the package submitter and let him make a decision for himself. It might sound like a detail but humans have a tendency of creating policies to solve problems, and this is where the community starts to deteriorate. Hopefully I am not going too meta or anarchist :slight_smile:

Making a guideline to inform the contributors is a good idea.

I think people submitting new package should ask themselves:

  • Are they willing to maintain that package or is it just something thrown over the fence? If they aren’t willing to maintain the package, it’s going to become dead code quite quickly.
  • Is upstream healthy? If there are no stable releases, the project might be too young. If it’s unmaintained it’s going to be more work for you.
  • Is the package going to be useful to others? If not, it’s fairly easy to maintain a private overlay (and binary cache).

Assuming that the submitter is not an asshole or psychopath, I think it’s fine to leave the answer of those questions to them.

8 Likes

Software maintenance is a symptom of problems, not something desirable per se.

I don’t want to get too deep into the philosophy of software maintenance here, but I think this helps us only in the best of all possible worlds, and we’re probably not living in that one. Here and now, the overlap between “is actively maintained” (or at least: if a problem somehow pops up, an issue can be opened and will likely be dealt with within a foreseeable time-frame) and “it’s easy to see that this is a package that can be included without too much hassle” is almost complete.

I don’t know, for some of the tools I use all the old problems are likely to be already known. Unfortunately, one cannot usually fully live in this world (typically because of browsers and office file formats), and perfection is rare, but for a lot of things one can comfortably use things that stay exactly the same without any changes for multiple years.

Now, there are totally some packages out there that are only updated very irregularly upstream (maybe only 1-2 commits a year) and those should definitely be allowed to be included.

1-2 commits (or short series of commits) a year (with these commits changing something) should be enough to call something maintained, unless there are strong reasons to say otherwise (like serious bugs with no reaction at all).

As a first gauge, at the time of the init commit for a new package, that’s still a pretty good heuristic though. If somebody wants to include a package which hasn’t received any upstream development for a long time, I think the burden of proof is then on the would-be package maintainer to convince some reviewers that it’s include-worthy.

I think it is very good to have this be a thing that should always be mentioned. Proving against an explicit guideline is a bit too high a burden of proof in my opinion.

We won’t be able to come up with rules that will work for every single instance. There has to be some wiggle room. We also won’t get it right for every single package. But having some rules of thumb might help a lot. (A list of checks for new packages, akin to the checklist for every PR right now, might be enough already. A lot of PRs get merged without fulfilling every single check, and package inclusion would work similarly. A rough set of guidelines should do.)

Yes, that’s a good entry for a checklist, that’s true for sure.

1 Like

I like @zimbatm’s view on this (at least as I understand it). What we really need in my opinion is

  • a way to treat an empty maintainer list for a package similarly to a broken flag.
  • a way to detect maintainers not doing their job. For example if there is a PR or an issue about their package and they did not respond in any way ~1month after being pinged, they are removed from the maintainer list.
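
A minimal Python sketch of the second point, assuming we already have the time of the last ping and the time of the last response per maintainer. The function name, the data shapes, and the exact 30-day window are assumptions for illustration; nothing like this exists in nixpkgs tooling as far as this thread says.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional

RESPONSE_WINDOW = timedelta(days=30)  # the "~1 month" above; exact value is an assumption


def unresponsive_maintainers(pings: Dict[str, datetime],
                             responses: Dict[str, Optional[datetime]],
                             now: datetime) -> List[str]:
    """Return handles pinged more than RESPONSE_WINDOW ago that have not
    responded in any way since that ping (hypothetical helper)."""
    inactive = []
    for handle, pinged_at in pings.items():
        responded_at = responses.get(handle)
        answered = responded_at is not None and responded_at >= pinged_at
        if not answered and now - pinged_at > RESPONSE_WINDOW:
            inactive.append(handle)
    return sorted(inactive)
```

Any response after the ping counts, so a maintainer who just says "no time right now, go ahead" is never flagged.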

That way we don’t have to reject anything based on a feeling that it may not properly be maintained. Users can opt-out of unmaintained packages (and maybe that should be the default). Prospective users can see that it is unmaintained and pick up the mantle.

We’d also need to define clear boundaries of responsibility for library packages. When someone updates a library, in my opinion it is their responsibility to test all (if feasible) reverse dependencies. It would be nice of them to fix errors that pop up, but not a requirement. Instead, they could open a PR, flag the respective maintainers and give them reasonable time to get their packages working with the new library version.

10 Likes

We’re on the same page @zimbatm :sparkler:
My thought is so similar it’s a little scary, but I think it’s a good thing.

Though I do feel the documentation should mention some restrictions that
we all can agree on internally as practices that ensure the well being of the project.
So I think the format of the docs could be split into the “Direction” and the “Principle”.

I’ve also thought that there should be a way to detect if maintainers aren’t maintaining.
But I’m not sure it’d be removing from the maintainers list, perhaps just for the particular package.
People are referring to meta.maintainers in many places and it appears they’ve stopped contributing
long ago, so the information isn’t even up to date and they’re not removed. Not to mention ofborg will ping them and it doesn’t help.

That’s an idea I haven’t heard before :thinking: . I’m not sure what I think about dividing nixpkgs in that way, or if it’s needed or would be beneficial.

3 Likes

Just yesterday I discovered the nix-shell maintainers/scripts/update.nix --argstr maintainer zimbatm command. Surely that makes me a bad maintainer :smiley:

@worldofpeace do you mind dumping all the things you think a maintainer should know in a document? Just make a new git repo so we can all contribute and then once it’s ready we’ll figure out where to put it. What does it mean to be a maintainer, what should I look out for, what type of tools are available to me?

Another idea is to create an on-boarding experience for new maintainers. Right now most new maintainers are added without ceremony. We trust that they are going to find the information in the nixpkgs manual somehow. I think we should ask them to first read the maintainers page^ and confirm that they understand what it entails instead. That way it will create a more uniform view of how the maintainer system works.

9 Likes

Just make a new git repo so we can all contribute and then once it’s ready we’ll figure out where to put it.

This seems like a great first step. Something similar, and also currently being worked on, is the Contributing guide in GHC, so we can probably get some inspiration from there and from other projects’ similar pages.

Huh, didn’t know that one either. Another protip is to subscribe to your repology feed. Although in the very long run I hope r-ryantm can replace that completely, by opening issues (with maintainer ping) for stuff it can’t update itself.

2 Likes

!!! @zimbatm

Totally :fire: We’ll see if I can string these threads of thought into something useful.

Excellent, this could be really simple too. It would be great if GitHub could just expose that for us.
Or if being added to Sign in to GitHub · GitHub gave you an email
with that content.

We have Nixpkgs 23.11 manual | Nix & NixOS chapters 13-15 which are essentially this.
But as raised here, this documentation is completely practical.

If moved-on/inactive maintainers are a systemic problem, I wonder if the default could be to remove maintainership automatically from time to time unless the user takes one of a few affirmative actions?

Might have a lower chance of alienating a maintainer relative to a manual challenge process.

2 Likes

Automatic garbage collection :wink: .

In a previous life, I was a NetBSD developer. In NetBSD your developer privileges would be revoked after a year without commits. There was a gentle warning some time before the expiry date. I think it generally worked well, because it does not feel like someone is trying to force you out of the project, but rather like an automatic process. If someone wanted to stay onboard and just didn’t have much time over the past year, they’d just make a small contribution.
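
As a rough illustration of that kind of timer, here is a small Python sketch. The one-year expiry comes from the description above; the 30-day warning lead and the names are assumptions, since the exact NetBSD numbers aren't given here.

```python
from datetime import date, timedelta

EXPIRY = timedelta(days=365)       # "a year without commits"
WARNING_LEAD = timedelta(days=30)  # assumption; the post only says "some time before"


def privilege_status(last_commit: date, today: date) -> str:
    """Return 'active', 'warn', or 'expired' for a developer (hypothetical helper)."""
    expires_on = last_commit + EXPIRY
    if today >= expires_on:
        return "expired"
    if today >= expires_on - WARNING_LEAD:
        return "warn"      # time for the gentle reminder
    return "active"
```

Any commit resets `last_commit`, so a single small contribution moves the developer back to "active".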

2 Likes

What happens if you maintain a single package that only has releases less than once per year?

Any kind of maintainer timeout needs to take into account the actual release schedule of the package. It shouldn’t be “this maintainer hasn’t submitted anything for X time period”, it should be more “this maintainer has let their packages sit outdated without doing anything about it”.

Of course, a rule like that can’t be terribly strict either. Maybe the maintainer is the only user of a particular package and so only updates it when they want to. Or maybe a package has a new release that’s broken and the maintainer is just waiting for upstream to get around to fixing it.
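
Something like the following Python sketch is what I have in mind, with the caveat that the 90-day grace period, the `hold_reason` escape hatch, and the function name are all invented for illustration.

```python
from datetime import date, timedelta
from typing import Optional

GRACE = timedelta(days=90)  # hypothetical grace period after an upstream release


def is_neglected(packaged: str, upstream: str, upstream_released: date,
                 today: date, hold_reason: Optional[str] = None) -> bool:
    """True only if the package has sat outdated past the grace period
    and the maintainer has given no reason for holding back."""
    if packaged == upstream:
        return False   # nothing to update, however old the last commit is
    if hold_reason:
        return False   # e.g. "new release is broken upstream"
    return today - upstream_released > GRACE
```

A package whose upstream releases once every few years never trips this check as long as the packaged version matches the latest release.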

4 Likes

I think we’re mixing two entirely different concepts here: maintainers and committers. One may be either without being the other (though committers that don’t maintain anything are probably rare).

For maintainers, it makes sense to remove them only if they slack in their duties. And I only see a reason to remove them at all if we combine that with some mechanism to filter unmaintained packages. For committers, I agree that something based on the last contribution might make sense, but that probably belongs to a different thread (maybe the maintainer discussion already does, though it’s closely related to the OT).

I guess you’re responding more to danieldk, but all you need do is take a minor affirmative action to confirm you exist and are still interested in maintaining the package. It’s a backstop; the action can be small/simple.

An enforcement bot could be strict without being draconian. Perhaps it opens a pull request removing maintainers that haven’t committed over some interval (perhaps in a separate repo?), mentions them in the message, and gives them a month to reply (or thumbs-up/down, etc.)

I guess it could even automatically re-add you if you respond later on.
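
Sketching the rule such a bot might apply, in Python. This only models the decision, not the GitHub side; the function name and the 30-day window are assumptions based on the "month to reply" above.

```python
from datetime import date, timedelta
from typing import Optional

REPLY_WINDOW = timedelta(days=30)  # "a month to reply"


def maintainer_listed(proposed_on: date, replied_on: Optional[date], today: date) -> bool:
    """Whether a maintainer is on the list after a removal proposal.

    - No reply yet, window still open: still listed.
    - No reply, window closed: removed.
    - Any reply, even a late one: listed (late replies re-add the maintainer).
    """
    if replied_on is not None and replied_on <= today:
        return True
    return today - proposed_on <= REPLY_WINDOW
```

The only thing the maintainer ever has to do is reply once, which keeps the affirmative action as small as possible.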

I mostly agree if “missing” maintainers aren’t really a significant issue.

Maybe not quite with the filtering part? It seems like being able to strike inactive maintainers provides a route to enabling build commands to let people know when they’re building something that depends on an unmaintained package, and nudge them to help maintain it to ensure it gets timely updates or doesn’t end up on the chopping block.

But if missing maintainers are a drag, it seems like formalizing and de-personalizing the removal process should help protect the emotional energy of whichever core members would otherwise end up initiating removal.

1 Like

Only in the Nix community: a long thread where everybody is working together to improve things instead of fighting over who is right or wrong. <3

If we are working on the “user experience” of the package maintainer, there are probably a few more things that we can do:

  1. There should be a way for issues to be mapped to maintainers automatically. – This will help users connect to the right person who has experience with the package.
  2. Maintainers should be able to merge PRs touching only their package if ofborg is green. – This should help reduce the workload of nixpkgs committers by a large margin.
  3. Maintainers should know how to set up their packages so they work with ryantm’s updater script. – This reduces the maintainer’s workload, especially combined with (2).
  4. Each package should have their own reporting page with number of issues / PRs open over time. – For better transparency and also for users to decide if they want to adopt a package or not.
  5. The maintainer role should have a clear definition. – This should help make sure everybody is on the same page.
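
For point (2), the eligibility check could look roughly like this Python sketch. It assumes we can build a mapping from package directory to maintainer handles (for instance from meta.maintainers); the mapping shape, the function name, and the `ci_green` flag standing in for "ofborg is green" are all hypothetical.

```python
from typing import Dict, List, Set


def can_self_merge(author: str, changed_files: List[str],
                   maintainers_by_path: Dict[str, Set[str]],
                   ci_green: bool) -> bool:
    """True if CI passed and every changed file sits under a package
    directory that lists the PR author as a maintainer."""
    if not ci_green or not changed_files:
        return False
    for path in changed_files:
        # Collect maintainers of every package directory containing this file.
        owners = {
            handle
            for prefix, handles in maintainers_by_path.items()
            if path.startswith(prefix.rstrip("/") + "/")
            for handle in handles
        }
        if author not in owners:
            return False
    return True
```

Note that a PR that also touches a shared file outside the package directory would not qualify under this sketch and would still need a committer.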

We don’t have to implement everything at once. Each of these points would help make the maintainer and user experience better.

10 Likes
  1. There should be a way for issues to be mapped to maintainers automatically. – This will help users connect to the right person who has experience with the package.
  2. Maintainers should be able to merge PRs touching only their package if ofborg is green. – This should help reduce the workload of nixpkgs committers by a large margin.
  3. Maintainers should know how to set up their packages so they work with ryantm’s updater script. – This reduces the maintainer’s workload, especially combined with (2).
  4. Each package should have their own reporting page with number of issues / PRs open over time. – For better transparency and also for users to decide if they want to adopt a package or not.

And if we have all that nice tooling for handling the maintenance roles, the question of inactive maintainers might end up being auto-defined: a maintainer is inactive if maintainer feedback was requested but never happened in any form. So we wouldn’t need any special casing for a pause in upstream releases.

1 Like

It sounds like we’re talking about the same thing. What I meant to say is that removing inactive maintainers is only really useful if we actually use the information somewhere. That could be something that acts similar to meta.broken in that it fails (opt-in or opt-out) to build/fetch unmaintained packages. Or print a warning.
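
To illustrate the three behaviours (ignore, warn, fail), here is a small Python model of the policy. It is only a sketch of the idea: the real thing would live in Nix next to the meta.broken handling, and the policy names and function here are made up.

```python
from typing import List


class UnmaintainedPackageError(Exception):
    """Raised when policy forbids building a package with no maintainers."""


def check_maintained(name: str, maintainers: List[str], policy: str = "warn") -> List[str]:
    """Apply an 'allow', 'warn', or 'fail' policy to a package.

    Returns a list of warning messages (empty if there is nothing to report).
    """
    if policy not in ("allow", "warn", "fail"):
        raise ValueError(f"unknown policy: {policy}")
    if maintainers or policy == "allow":
        return []
    message = f"{name} has no active maintainers"
    if policy == "fail":
        raise UnmaintainedPackageError(message)
    return [f"warning: {message}"]
```

Whether the default is "warn" or "fail" is exactly the opt-in versus opt-out question.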

1 Like

Failing a build because of an inactive maintainer means a package that worked perfectly well for years could suddenly stop building with no changes made to it because its maintainer was marked as inactive.

1 Like