Pre-RFC: Package advisories

Summary

Introduce an “advisory” system into Nixpkgs, similar to broken and insecure, but with a custom per-package message. This will then be used to warn users about packages that are in need of maintenance. Packages that have had an advisory for a long time should eventually be removed.

Motivation

Nixpkgs has the problem that it is often treated as “append-only”, i.e. packages only get added but not removed. There are a lot of packages that have been broken for a long time, have end-of-life dependencies with known security vulnerabilities, or are otherwise unmaintained.

Let’s take the end of life of Python 2 as an example. (This applies to other ecosystems as well, and will come up again and again in the future.) It has sparked a few bulk package removal actions by dedicated persons, but those are pretty labor-intensive and prone to burning out maintainers. A goal of this RFC is to provide a way to notify all users of a package about the outstanding issues. This will hopefully draw more attention to abandoned packages, and spread the workload. It can also help soften the removal of packages by providing a period for users to migrate away at their own pace.

Apart from that, there is a need for a general per-package warning mechanism in nixpkgs, one that is stronger than builtins.trace.

Detailed design

Package advisories

Two attributes are added to the meta section of a package: advisory and advisoryDate. The advisory is a string message describing the issue with the package. It is free-form text, but should follow the format category: description. Linking issues and pull requests is encouraged. Example values:

  • deprecated: This package depends on Python 2, which has reached end of life. #148779
  • removal: The application has been abandoned upstream, use libfoo instead

The advisoryDate is an ISO 8601 yyyy-mm-dd-formatted date from when the advisory was added. It must be present if and only if advisory is.
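
The “if and only if” rule and the date format could be checked mechanically. A minimal sketch, assuming a hypothetical helper named hasValidAdvisory (it does not exist in nixpkgs; the name and the regex-based date check are illustrative only):

```nix
let
  # Hypothetical helper, not part of nixpkgs: true when `advisory` and
  # `advisoryDate` are either both present or both absent, and the date
  # (if present) has the shape yyyy-mm-dd.
  hasValidAdvisory = meta:
    let
      hasMsg = meta ? advisory;
      hasDate = meta ? advisoryDate;
      # Only forced when hasDate is true, thanks to lazy evaluation.
      dateOk = builtins.match "[0-9]{4}-[0-9]{2}-[0-9]{2}" meta.advisoryDate != null;
    in
    hasMsg == hasDate && (!hasDate || dateOk);
in
{
  # Both attributes present, well-formed date: true
  ok = hasValidAdvisory {
    advisory = "removal: The application has been abandoned upstream, use libfoo instead";
    advisoryDate = "2022-06-01";
  };
  # Message without a date: false
  missingDate = hasValidAdvisory { advisory = "removal: abandoned upstream"; };
  # No advisory at all: true
  none = hasValidAdvisory { };
}
```

The regex only checks the shape of the date, not whether it is a valid calendar day.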

nixpkgs integration

Two new config options are added to nixpkgs, ignoreAdvisories and ignoreAdvisoriesPredicate. A new environment variable is defined, NIXPKGS_IGNORE_ADVISORIES. Their semantics and implementation directly parallel the existing “insecure” package handling.
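
For illustration, a user's nixpkgs config might then look roughly like this. This is only a sketch: the exact shape of both options is not specified above, so it assumes they mirror permittedInsecurePackages (a list of names) and allowInsecurePredicate (a function receiving the package).

```nix
# Sketch of a nixpkgs config.nix; both option shapes are assumptions.
{
  # Assumed: a list of package names whose advisories are ignored.
  ignoreAdvisories = [ "gksu" ];

  # Assumed: called with the package, returns true to ignore its advisory.
  # `pname or ""` guards against packages that only set `name`.
  ignoreAdvisoriesPredicate = pkg:
    builtins.elem (pkg.pname or "") [ "libgksu" "libglade" ];
}
```

Presumably NIXPKGS_IGNORE_ADVISORIES=1 would then ignore all advisories for a single invocation, the way NIXPKGS_ALLOW_INSECURE=1 does for insecure packages.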

Similarly to broken, insecure and unfree packages, a package with an advisory fails evaluation. Ignoring an advisory that has been resolved results in a warning at evaluation time.

A new helper function lib.ignoreAdvisory is added, which simply adds an override to a derivation that removes the advisory. This can be used where more ad-hoc handling is desired.
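
A minimal sketch of what such a helper could look like, assuming it only needs to strip the two meta attributes proposed above (the actual implementation may well differ):

```nix
let
  # Sketch of the proposed helper: drop both advisory attributes from meta.
  ignoreAdvisory = drv:
    drv.overrideAttrs (old: {
      meta = builtins.removeAttrs (old.meta or { }) [ "advisory" "advisoryDate" ];
    });
in
# Hypothetical usage as an overlay: acknowledge the advisory on gksu only.
final: prev: {
  gksu = ignoreAdvisory prev.gksu;
}
```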

Examples and Interactions

Package removal

There are two ways advisories interact with the removal of packages: Either they get an advisory because they are going to be removed, or they are removed because they have an unresolved advisory for a prolonged period of time.

  • Instead of removing a package directly, it should first get an advisory announcing the planned removal. This will allow users to migrate away beforehand. The advisory must have removal as its category (this will facilitate automation in the future).
  • Before branch-off for a new release, all (leaf) packages with advisories that predate the previous branch-off are deemed safe for removal (unless stated otherwise). If a package is removed based on its advisory, that advisory message becomes part of the new throw alias.
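
As a rough sketch of the second point, a removed package's entry in the aliases file could carry the old advisory text. The package name foo and the exact wording of the throw message are illustrative assumptions; the RFC does not fix a format.

```nix
# Sketch of an alias entry after removal; `foo` is a hypothetical package.
{
  foo = throw ''
    foo has been removed from nixpkgs.
    removal: The application has been abandoned upstream, use libfoo instead
  '';
}
```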

Propagation across transitive dependencies

When a package that is depended on has an advisory, all packages that depend on it will fail to evaluate until that advisory is ignored or resolved. Sometimes, this is sufficient.

When the advisory requires actions on dependents however, it does not sufficiently inform about all packages that need action. Marking all dependents with that advisory is not a good idea either though: it would require users to go through some potentially long dependency chains. Instead, only applications, leaf packages or packages with very few dependents should get the advisory.

As an example, take gksu with the gksu → libgksu → libglade → python2 dependency chain (for the sake of the example, ignore that it also depends on EOL Gtk 2). Obviously, python2 should get an advisory. As a leaf/application, gksu should get one too (it could be the same, or with an adapted message). For the packages in between, it depends on whether they require individual action or not.
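
In terms of the attributes proposed above, that could look roughly like this. The messages and the date are made up for illustration, and the attrset only shows the relevant meta fragments, not real package definitions.

```nix
# Illustrative meta fragments only; messages and dates are hypothetical.
{
  # The root cause gets an advisory.
  python2.meta = {
    advisory = "deprecated: Python 2 has reached end of life.";
    advisoryDate = "2022-06-01";
  };

  # The leaf application gets an adapted message of its own.
  gksu.meta = {
    advisory = "deprecated: Depends on Python 2 (via libglade), which has reached end of life.";
    advisoryDate = "2022-06-01";
  };

  # libgksu and libglade get none unless they need individual action.
}
```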

Backporting

New advisories generally should not be added to stable branches, and also not be backported to them, since this breaks evaluation.

Drawbacks

  • People have voiced strong negative opinions about the prospect of removing packages from nixpkgs at all, especially when they still technically work.
  • There is a slight long-term maintenance burden. It is expected to be similar to or slightly greater than the maintenance of our deprecation aliases.
  • Some of the example interactions are built on the premise that nixpkgs is under-maintained, and that most users are at least somewhat involved in the nixpkgs development process. At the time of writing this RFC this is most certainly true, but it is unknown whether this will still hold in the future.

Alternatives

An alternative design would be to have advisories as a separate list (not part of the package), and have them numbered. Instead of allowing individual packages, one could ignore individual warnings (they’d need an identifying number for that). The advantage of doing this is that one could have one advisory and apply it for a lot of packages (e.g. “Python 2 is deprecated”). The main drawback there is that it is more complex.

A few sketches of what the declaration syntax might look like in different scenarios:

{
  # As proposed in the RFC
  meta = {
    advisory = "deprecation: Python 2 is EOL. #12345";
    advisoryDate = "2022-06-01";
  };
  # Advisories get an identifier in some nixpkgs-global table
  meta = {
    advisories = [ "1234-python-deprecation" ];
  };
  # Attempt to unify both approaches to allow both ad-hoc and cross-package advisories
  meta = {
    advisories = {
      "1234-python-deprecation" = "deprecation: Python 2 is deprecated #12345";
    };
  };
}

Unresolved questions

  • Do we want to support having multiple advisories on a package? (Or at least leave the option for it in the future)

Future work

  • Advisories are designed in a way that they supersede a lot of our “insecure”/“unfree”/“unsupported” packages infrastructure. There is a lot of code duplication between them. In theory, we could migrate some of these to make use of advisories. At the very least, we hope that advisories are general enough so that no new similar features will have to be added in the future anymore.
  • Inspired by the automation of aliases, managing advisories can be helped by tooling as well. This is deemed out of scope of this RFC because only real world usage will tell which actions will be worthwhile automating, but it should definitely be considered in the future.
    • There will likely be a need for tooling that lists advisories on all nixpkgs packages, filtered by category or sorted chronologically.
    • Automatically removing packages based on time will likely require providing more information whether it is safe to do so or not.

Without commenting on the merits of the proposal itself, I strongly urge using structured data wherever possible. With or without using identifiers that can be applied across packages, metadata fields should always express their meaning through structure, instead of encoding it in names or string contents. Example:

meta = {
  advisory = {
    deprecated = {
      reason = "…";
      expires = "2023-05-01";
    };
  };
}

Okay, let’s shed that bike then. I agree that my initial proposal on that bit is somewhat ad-hoc, but I’d say yours is on the other end of the spectrum (too contrived). People should be able to write these rather easily. How about making advisory an attrset, but otherwise keeping things the same:

meta = {
  advisory.message = "…";
  advisory.date = "2022-06-05";
}; 

This gives us the room to easily add new attributes should more information be required in the future. Maybe one thing we should already do right now is to make the advisory.category a separate field instead of a formatting convention.

If the advisories were a list, and we also added them for modules, maybe we could auto-generate most release notes, and move release notes closer to the code they change.

Maybe not all kinds of advisories need to be acknowledged/ignored for evaluation to continue?

This is an interesting idea, but how would this actually work? (I’d say this is generally out of scope of this RFC, but I’ll keep it in mind as future work)

I built it that way and not like you proposed here

only because I understood from your proposal one could have multiple advisories, or each of a different type. If there is only supposed to ever be one at a time, your variant is fine.


Ah, I see. I think if we allow for multiple advisories, they could also have the same category (i.e. deprecation warnings for both Python 2 and Gtk 2), so a list might be better instead.
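
Something like the following, as a rough sketch. The attribute name advisories and the field names are assumptions carried over from the earlier suggestions in this thread, and the messages and dates are illustrative.

```nix
# Sketch of a list-based variant; attribute and field names are assumptions.
{
  meta.advisories = [
    {
      category = "deprecated";
      message = "Depends on Python 2, which has reached end of life.";
      date = "2022-06-05";
    }
    {
      # Same category as above, which an attrset keyed by category
      # could not express.
      category = "deprecated";
      message = "Depends on Gtk 2, which has reached end of life.";
      date = "2022-06-05";
    }
  ];
}
```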

Bike shedding of the general name courtesy of @hexa:

  • Problems
  • Warnings
  • Issues

I’ll have to think about this some more, but both warnings and issues would be fine for me too.

  • Getting the name right is important in the long run, because it needs to work for most use cases we can imagine/extrapolate.

    • Advisory is probably a term that should be reserved for security advisories and I would prefer if we would not overload it
  • Packages are generally affected by multiple issues, so we need a structured way to reflect that.

  • What about allowing users to ignore generic classes of issues? Then we’d need a common way to define similar issues.

    • Example user story: “I don’t care about packages without maintainers, since I won’t start maintaining packages myself anyway. Please don’t warn me about those.”
  • Big fan of transitive deprecation warnings! People can’t be mad if we remove things that have had critical problems for a long time, if they were informed of these problems early enough. Actually a big fan of expectation management in general!

  • There should be a structured way to record the date when the issue description was added. Parsing comments, as is done for aliases, is quite a bit more fragile.


If we put the category into its own field, users can then provide a custom predicate to ignore entire classes of issues. Like for insecure packages, the main setting for ignoring is a predicate function and the list of packages that will be matched by name is just for convenience.
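
For the “no maintainers” user story above, that could look roughly like this. It is a sketch under the assumption that the advisory becomes an attrset with a category field and that the predicate receives the package; the category name unmaintained is made up.

```nix
# Sketch only: assumes meta.advisory is an attrset with a `category` field
# and that "unmaintained" is one of the agreed-upon categories.
{
  ignoreAdvisoriesPredicate = pkg:
    # `or null` covers packages without any advisory.
    builtins.elem (pkg.meta.advisory.category or null) [ "unmaintained" ];
}
```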