Nixpkgs is impressive. But how is that gap accounted for, anyways?
I'd be skeptical of any answers that aren't backed up by an analysis of what's behind in nixpkgs, and whether the set of laggards is fairly stable or rapidly changing.
With that big fat asterisk, a few thoughts:
- AFAIK, repology learns about a new package version when it first discovers that at least one repo has updated to it. Many of the repos that are the least out-of-date, per repology, are smaller language/ecosystem package manager repos that are more or less the canonical/reference repo for their corner of the packaging universe. I haven't tried to analyze this, but I assume updates to these packages almost always appear in these repos first (perhaps published by a developer, or updated automatically?).
  A general package repository is going to have a hard time keeping up with repos that are canonical or near-canonical for most of the packages they contain. We won't ever be the first to know about these unless we set up infrastructure to independently monitor source repositories for all of their packages.
- I'm not sure it would be meaningful to try to beat/tie those repos on individual updates, because I think it's normal or normal-ish for us to batch update these (as in rPackages: CRAN and BioC update · NixOS/nixpkgs@7ed992a · GitHub) on some interval. For example, this search shows that the CRAN update appears to be included in a broader R package update every 6 weeks: Search · repo:NixOS/nixpkgs "rPackages: CRAN and BioC update" · GitHub.
  I'm not intimately involved in any of these, but I imagine they happen this way due to limitations on human time/energy.
- AFAIK, something like r-ryantm is driven by what repology knows is the latest update. For the packages the bot is able to update, there'll be some amount of lag between when repology notices, r-ryantm notices, the PR makes it through review, the package update propagates to unstable, and repology in turn notices that we're up to date.
- Even if r-ryantm knows about an update via repology, it isn't going to open a PR if the package build fails for some reason. If the maintainer or someone else in the community doesn't notice and raise a flag or try to fix it, it'll go untouched until someone does.
If you really want to make the line go up, I'd guess the 3 most meaningful places to look are:
- figuring out whether additional resources can safely shrink the intervals at which parts of nixpkgs are bulk-updated from some other source, without creating some other bottleneck (reviewing, ofborg, hydra, etc.)
- developing some automated tooling to recognize and surface packages that repology knows we're out of date on but that neither r-ryantm nor any other update bot is able to attempt, and fixing any that are easy
- developing some automated tooling to recognize and surface packages that r-ryantm is failing to update, and seeing if nudging the maintainer or surfacing that information for users is sufficient to get them fixed

(I'm not sure it'd be a good use of time to send people who are unfamiliar with a given package out into the world to try and fix all of these just to make the line go up. People being interested enough in the package to ask/report/PR seems like a good filter…)
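To get a feel for what shrinking a bulk-update interval might buy, here is a back-of-the-envelope sketch. It is a toy model, not a measurement. It assumes upstream releases arrive as a Poisson process, and that every bulk update brings the package fully up to date with no review, hydra, or channel lag. The function name `stale_fraction` and the one-release-every-12-weeks package are made up for illustration. Only the 6-week interval comes from the CRAN example above.

```python
import math

def stale_fraction(releases_per_week: float, interval_weeks: float) -> float:
    """Expected fraction of time a package is behind upstream.

    Toy model (assumption): releases are a Poisson process and the package
    is brought fully up to date at every bulk update.
    """
    x = releases_per_week * interval_weeks
    if x == 0:
        return 0.0
    # P(no new release within t weeks of a bulk update) = exp(-rate * t).
    # Averaging that over one interval gives (1 - exp(-x)) / x.
    return 1.0 - (1.0 - math.exp(-x)) / x

# Hypothetical package that releases about once every 12 weeks.
rate = 1 / 12
for weeks in (6, 3):
    share = stale_fraction(rate, weeks)
    print(f"{weeks}-week interval: stale about {share:.1%} of the time")
```

Under these assumptions, halving the interval roughly halves the time spent stale for slow-moving packages, but helps less for packages that release much more often than the update interval.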
Probably because this way of "measuring" freshness assigns equal weight to binutils and left-pad.
The graph charts the number of total packages and fresh packages for each repo. Then the angle from the origin to the point represents the percentage of fresh packages in that repo. Hypothetically, if a repo lies on the line then all its packages are fresh, and if a repo lies on the x axis then none of its packages are fresh.
So the question posed above is: Why are less than 100% of packages in nixpkgs fresh?
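The geometry described above can be sketched in a few lines. I'm assuming the x axis is total packages, the y axis is fresh packages, and both axes use the same scale, so the "all fresh" line is the 45° diagonal. The function name `freshness` is made up for illustration.

```python
import math

def freshness(total: int, fresh: int):
    """Return (fresh share, angle in degrees) for a repo plotted at (total, fresh).

    Assumes x = total packages, y = fresh packages, equal axis scales.
    """
    if total <= 0 or not 0 <= fresh <= total:
        raise ValueError("expected total > 0 and 0 <= fresh <= total")
    share = fresh / total
    # 0 degrees: on the x axis (nothing fresh); 45 degrees: on the diagonal.
    angle = math.degrees(math.atan2(fresh, total))
    return share, angle
```

Note that the angle grows with the fresh share but is not proportional to it (the share is the tangent of the angle), so reading percentages off the chart by eye is only approximate.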
There are some other interesting things to note about this chart:
- Repos that are their own source of truth will always lie on the line, e.g. CRAN, Hackage, CPAN, Ruby Gems, crates.io, etc, because what is a fresh package is defined by it being published in that repo.
- The absolute number of fresh packages in nixpkgs is still 2-3x most other repos
I agree with abathur that this really needs a detailed analysis of which packages are behind before drawing any conclusions, including the conclusion that this is something that needs to be "fixed".
It looks like repology exposes a week of database dumps at https://dumps.repology.org.
Given that they list when each new version appears in a package's history tab and when each repo caught up, I assume that the database contains sufficient information to help answer some basic questions…
- It may not have a sense of which repo is ~canonical, but I imagine it's possible to cluster packages that always appear first in a given repository
- get a sense of the typical update interval for every package in unstable
- ballpark what impact halving the interval of a given bulk update would've had on freshness over the last year
- figure out what was out of date on every day of the year and whether it's a constantly churning pot of 2-week update cycles or whether there's a big core of packages that are almost always out of date
- maybe try to tease out sets that are out of date because they release extremely often (much effort to keep up) vs. those that went 4 years without a release and suddenly sprang to life (anyone who cares may no longer be watching carefully for updates)?
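As a rough sketch of the "constantly churning vs. chronically stale" question above: I haven't looked at the dump schema, so the records below are hand-made and hypothetical, as are the function names and the 50% threshold. Each record says when a newer upstream version was first seen and when our repo caught up (or `None` if it never did within the window).

```python
# Hypothetical records, NOT the real repology dump schema:
# (package, day a newer upstream version was first seen, day we caught up or None)
EVENTS = [
    ("foo", 0, 3),
    ("foo", 14, 16),
    ("bar", 2, None),
    ("baz", 5, 25),
]

def days_outdated(events, horizon):
    """Count, per package, the days in [0, horizon) on which it was behind."""
    behind = {}
    for pkg, seen, caught_up in events:
        end = horizon if caught_up is None else min(caught_up, horizon)
        # A set keeps overlapping stale periods from being counted twice.
        behind.setdefault(pkg, set()).update(range(seen, end))
    return {pkg: len(days) for pkg, days in behind.items()}

def split_chronic(events, horizon, threshold=0.5):
    """Split packages into (chronic, churning) by share of days spent behind."""
    counts = days_outdated(events, horizon)
    chronic = sorted(p for p, n in counts.items() if n / horizon >= threshold)
    churning = sorted(p for p in counts if p not in chronic)
    return chronic, churning

chronic, churning = split_chronic(EVENTS, horizon=30)
print("almost always behind:", chronic)
print("short update cycles:", churning)
```

With real data, the same per-day view would also show whether the out-of-date set is mostly the same packages from day to day.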
It would be cool if we could get nixpkgs master onto Repology. I'm not sure if they'd be amenable to it since it isn't really a recommended way to consume packages.
Channels are usually 2–3 days behind master, so I don't expect it would make such a big difference for the numbers.
Repology doesn't use Nixpkgs' Nix expressions directly but the packages.json we generate for the channels.
Master wouldn't really work out.
It doesn't necessarily, but humans can be impatient, and a distribution like Arch, with its pacman db, often strives for day-1 updates, which our current infra sort of rules out by default.
Of course, from the standpoint of reason I could say that this really doesn't matter, since it's typically only a few days behind anyway; but humans are not always reasonable either.
That's not to say that there aren't real ways that we could probably improve this so-called "freshness" metric. Systems like Debian & Arch, which I am most familiar with from previous experience, typically have a clean split between what are deemed "officially supported" packages and those that are community maintained. We somewhat have this split in nixpkgs too: we have the nixpkgs repo itself, and there are plenty of other repositories on GitHub containing Nix code, such as NUR or the collection of flakes out in the wild.
Some of those packages even have their own repo-specific packaging and a more stable version in nixpkgs. But I digress. The point is that our version of this official/unofficial split is more ad hoc: a lot of packages that aren't, and maybe have never been, well maintained end up in nixpkgs anyway, and conversely, there are some good Nix packages out in the wild that don't exist in nixpkgs. I believe, to some extent, that this clear boundary in other projects makes it easier to focus on freshness where it really counts, i.e. "officially supported" packaging.
Of course, there is also the simple fact that packages that more individuals find useful will typically just naturally receive more attention at the packaging layer, so it's the packages without this natural advantage that could likely use the most help, at least if they are to be officially supported in nixpkgs.
I'm not saying I know exactly what we should do about it at this point, but we could maybe start by trying to make that official/unofficial distinction clearer than it currently is, even if it means removing some poorly maintained packages for now.
I wondered the other day how many outdated packages my system has compared to the proverbial 89% freshness and wrote a nix-olde hack that measures just that. For packages in my system it reports: 325 of 1425 (22.81%) installed packages are outdated according to https://repology.org. That is a lot worse than 89% fresh.
The bulk of these stale packages are:
- xorg.*, which does not have an auto-updater
- python:* packages (trail something?)
- perl:* packages (trail something?)
- haskell:* packages (trail stackage? expected to lag)
Random factoid:
- when run on master, the tool reports: 372 of 1432 (25.98%) installed packages are outdated according to https://repology.org.
- when run on staging, the tool reports: 325 of 1425 (22.81%) installed packages are outdated according to https://repology.org.
staging → master → channel does not make too big of a difference WRT outdated packages, at least on my system. Many of the packages are just stuck on old versions and need manual intervention.
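For comparison with the 89% figure, the counts reported above convert to fresh shares like this (the helper name `outdated_share` is made up; the numbers are the ones the tool printed):

```python
def outdated_share(outdated: int, total: int) -> float:
    """Fraction of installed packages reported as outdated."""
    return outdated / total

# Counts copied from the tool output quoted above.
reports = {"master": (372, 1432), "staging": (325, 1425)}

for branch, (outdated, total) in reports.items():
    share = outdated_share(outdated, total)
    print(f"{branch}: {share:.2%} outdated, {1 - share:.2%} fresh")
```

That works out to roughly 74% fresh on master and 77% fresh on staging for my system.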
Nah. Sticking to master is not a general recommendation. The unstable channels provide a rolling-release NixOS well enough.
For me a bigger concern is that there are bugs which block some package updates for a pretty long time. In other words, a small group of packages is really outdated.
I think the community should schedule a NixOS release where the focus is mostly on removing cruft, refactoring and cleaning up. Apple did this at some stage (Snow Leopard) and it was very successful. Intel is also refactoring architectures every other release cycle. It's good to keep technical debt under control.
However, in general, I find channels to be fresh and Nix quite pleasant to use.