Stats/trends on github issues&prs

jonringer · July 7, 2020, 12:52am

One of the things that really drew me towards the nix community was the lack of gate keeping.

The only real [non-technical] barrier is creating a github account, which is largely ubiquitous anyway.

7c6f434c · July 7, 2020, 6:57am

Maybe the mostly-organic volunteer-based nixpkgs process doesn’t scale with the ease of github contributions.

… and a lot of small, easy-to-agree-to-try process improvements are made more expensive by the fact that 1) GitHub tracking lacks what is table stakes in dedicated issue trackers, and 2) external tools need to integrate with proprietary-implementation-only, semi-transparently-rate-limited, sometimes-breaking (without documenting the changes) APIs.

This sounds like «making easy things very easy, and medium things hard», and the problem of Nix* processes not scaling is partially caused by the problem of GitHub UX model not scaling to some of Nixpkgs parameters.

raboof · July 7, 2020, 8:26am

Wow that is cool!

For me as a casual contributor (15 PR’s merged, 10 abandoned, 6 still open, 1 draft - mostly simple stuff) github seems to work really well. Indeed there are many interesting ideas for improved tooling, but I’m not sure switching to another tracker would really make so much of a difference.

We do this in another project I’m involved in, and TBQH I think it’s rather tedious and not really worth it: it adds work (deciding what belongs where, which can be hard) and I’m not sure it has a benefit. Many ‘potential bugs’ are not high-priority or actionable, so I’m not convinced you actually achieve more ‘cleanliness’ this way.

Such a barrier would make me very sad - the open model is one of the main things that attracted me to nix.

There are some great ideas for useful tooling here. I’ll resist adding my own until I actually have time to help make them happen, though

danieldk · July 7, 2020, 8:48am

Please, let’s not do this. This does not add to a qualitative discussion.

musicmatze · July 7, 2020, 12:52pm

Casual contributors do not feel the pain of maintainers, of course. The problem is that it does not scale for the community at large, or maintainers - and that’s my point. Fire-and-forget PRs are not the problem here.

ryantm · July 7, 2020, 1:32pm

This whole discussion seems predicated on the assumption that the stats are bad. Has anyone compared them to a comparable project with only about 1000 maintainers?

If anything, the stats tell me our tooling is comparatively amazing. We are getting more done with fewer people than anyone else.

I’m basing this mostly on comparing us with the AUR, which Repology says has 9000 maintainers.

Thra11 · July 7, 2020, 1:39pm

Echoing what @zimbatm said, I feel like we need better tools to match issues and PRs to interested parties in a more dynamic / casual manner.

Sometimes, I have some spare time, so I think, “maybe I’ll take a look at some nixpkgs issues/PRs”. I sit down, open up the nixpkgs issues or PRs page on github, and start reading the titles. There are over 4000 open issues / 2000 open PRs, so I know there are plenty that I can contribute to in there somewhere. However, after scanning the first 10 or so pages, I’ve seen a lot of things I am not interested in or know nothing about (Sometimes I run into a massive block of automated issues or PRs created by a bot).

Sure, I could try to review a lot of these things, but if I could find the issues/PRs that:

Relate to packages I use or would like to use.
Relate to languages and frameworks that I am familiar with.
Affect platforms which I have.

then I feel like a) my attention is more likely to help the issue/PR and b) I’m more likely to benefit. After a few pages of items which aren’t interesting to me, I usually give up. As a result, my main way of finding issues and PRs at the moment is to wait until I have a problem or want something packaged, then check if there are any related issues or PRs. And then once the issue is fixed, I forget about it, whereas I’d often be happy to look at similar issues in the future.

In an ideal world, I would like to be able to “match with” issues/PRs which interest me in two ways:

Searching/Browsing: I want to be able to find interesting issues/PRs easily when I go looking for something to help with.
Subscription: I want to be able to receive some form of low priority notification if a relevant issue/PR is opened. Maintainership isn’t really appropriate for this, as it sets the bar far too high. If I add myself as a package maintainer, it implies a certain level of interest and commitment: it would be noisy and disruptive if I added myself as a package maintainer, then removed myself again when I realised I was no longer interested or was “subscribed” to too many packages.

Things I want to filter/match:

Packages and modules
Topics: I can search for topic labels, but I can’t see any way to subscribe to one
Maybe even just combinations of keywords

I realise that this is a complex problem. Some ideas for things (of varying difficulty) that might help:

The ability to match issues and PRs to packages. ofborg already works out what packages are affected by PRs. What if instead of (or as well as) asking reporters to cc maintainers and git blamed committers, we asked them to specify package and module names in a machine-readable way?
Some sort of script/program which looks at the packages installed on my system and searches github for issues and PRs relating to them (or creates a subscription).
Encourage triaging and labelling issues. For example, it should be fairly straightforward for a beginner to look at a package-request issue, find out what language/framework the package uses and add the appropriate labels.
Some documentation on how to search GitHub effectively (e.g. I’ve worked out that you can write NOT something in your query to exclude certain strings: presumably there are other things that might help narrow it down to interesting items). Can you subscribe to a search query on GitHub? I don’t think you can, but it would be really helpful if you could.
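A couple of the ideas above (searching by package name, excluding strings with NOT) can be sketched in a few lines of Python. This is only a rough illustration: the package list is assumed to come from somewhere else (for example, whatever tool lists the packages on your system), the function names are made up for this example, and the qualifiers used (`repo:`, `is:`, `in:title`, `NOT`) are ordinary GitHub search syntax. It only builds queries and URLs; it does not talk to the GitHub API and does not solve the subscription problem.

```python
from urllib.parse import urlencode

REPO = "NixOS/nixpkgs"  # the repository discussed in this thread


def build_query(package, kind="issue", exclude=()):
    """Build a GitHub search query for open issues or PRs naming a package."""
    if kind not in ("issue", "pr"):
        raise ValueError("kind must be 'issue' or 'pr'")
    parts = [f"repo:{REPO}", f"is:{kind}", "is:open", f"{package} in:title"]
    # "NOT word" excludes items containing that string.
    parts += [f"NOT {word}" for word in exclude]
    return " ".join(parts)


def search_url(query):
    """Turn a query string into a URL for GitHub's web search page."""
    return "https://github.com/search?" + urlencode({"q": query, "type": "issues"})


def queries_for(packages, **kwargs):
    """Map each package name (hypothetical input list) to its search query."""
    return {name: build_query(name, **kwargs) for name in packages}
```

For example, `search_url(build_query("ripgrep", kind="pr"))` gives a link you could bookmark per package. Matching on the title is a crude stand-in for the machine-readable package names suggested above.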

If they’re possible, I think some of these things would be really helpful for beginners or casual contributors. The ability to just subscribe to a package or topic wouldn’t be a big commitment (you can always unsubscribe, and you don’t feel like you have to respond to every issue/PR), and would help to get the right eyes looking at the right issues/PRs.

jonringer · July 7, 2020, 4:23pm

For certain languages, ofborg will add a label related to the programming language. This isn’t foolproof, as it just checks the directory and sees if it corresponds to a known language, so applications packaged outside of that directory don’t get labeled.
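To make the limitation concrete, here is a minimal sketch of directory-based labeling in Python. It is not ofborg's actual implementation; the prefix table and the `labels_for` function are assumptions made up for illustration, and the label names are only examples.

```python
# Hypothetical mapping from directory prefixes to topic labels (illustrative only).
LANGUAGE_DIRS = {
    "pkgs/development/python-modules/": "6.topic: python",
    "pkgs/development/haskell-modules/": "6.topic: haskell",
}


def labels_for(changed_paths):
    """Return the sorted set of labels whose directory prefix matches a changed path."""
    labels = set()
    for path in changed_paths:
        for prefix, label in LANGUAGE_DIRS.items():
            if path.startswith(prefix):
                labels.add(label)
    return sorted(labels)
```

A Python application that lives under, say, `pkgs/applications/` matches no prefix and gets no label, which is exactly the gap described above.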

This applies to committers as well: we are not interested in all of the PRs we review, but if it’s important enough for someone to take the time to package it, then it’s important to them.

bbigras · July 7, 2020, 6:44pm

Before the last release I really wished there was a way to list all the outdated packages that I’m using so that I can help update them before the freeze.

jonringer · July 7, 2020, 8:28pm

You can go to Projects list - Repology and find the package.

This has two major shortcomings though:

Another repository has to have the same package with exactly the same package name
The other repo has to be more up-to-date for it to show as outdated (if the package is also registered in a repository like CPAN or hackage, this is great, but many packages are hosted through github, pypi, or other repositories which can’t be queried in a similar fashion.)

related tools:

GitHub - Mic92/nix-update: Swiss-knife for updating nix packages. Allows for easy update of src
- supports github, gitlab, pypi, and a few others

EDIT:
I remembered another feature of repology, which is looking yourself up as a maintainer, and then you can navigate to out-of-date packages you maintain. This of course, requires you to be a maintainer for the given package though.
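As a rough sketch of the comparison described above: Repology exposes per-project data as a list of package entries, and the Python below assumes each entry has `repo`, `version` and `status` fields in that style. The sample data is invented, the `nix_` repo prefix is an assumption, and fetching the data is left out entirely.

```python
def outdated_in_repo(packages, repo_prefix="nix_"):
    """Return (repo, version, newest_versions) for entries marked outdated.

    `packages` is assumed to be a list of dicts with "repo", "version"
    and "status" keys, in the style of Repology's per-project output.
    """
    newest = sorted({p["version"] for p in packages if p.get("status") == "newest"})
    return [
        (p["repo"], p["version"], newest)
        for p in packages
        if p.get("repo", "").startswith(repo_prefix) and p.get("status") == "outdated"
    ]


# Invented sample data, for illustration only.
SAMPLE = [
    {"repo": "nix_unstable", "version": "1.2.0", "status": "outdated"},
    {"repo": "aur", "version": "1.3.0", "status": "newest"},
    {"repo": "debian_unstable", "version": "1.1.0", "status": "outdated"},
]

# A package that no other repository carries can never show up as outdated.
ONLY_IN_NIX = [{"repo": "nix_unstable", "version": "0.9", "status": "unique"}]
```

`outdated_in_repo(SAMPLE)` reports the nix entry, while `outdated_in_repo(ONLY_IN_NIX)` is empty, which mirrors the first shortcoming listed above.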

bbigras · July 7, 2020, 11:10pm

Thanks for the reply.

I did use repology that way before the last freeze and I did update a couple of them. I don’t know if nix-update was a thing then but I definitely use it all the time.

If there was a tool that would go over all the packages that I use and publish those somewhere, I would be more than fine with it. It could then be used for all sorts of things: package popularity, security notifications, and letting me know when a package that I use has an issue.

I was kinda referencing that comment from @zimbatm.

I too wouldn’t mind receiving notifications when one of the many packages I use needs a review or when its auto-update PR is blocked.

I’m not only interested in the packages I use but I would probably prioritize them.

SRGOM · July 8, 2020, 3:24am

Something else that came to my mind: how would it be if there were a bot that does this:

check if the PR title is in the format [channel] <string>: <date or ver or num> -> <date or ver or num>
check if exactly two lines have been changed, ver and hash.
The version should be one that exists in the upstream GitHub repo, not a fork (GitHub lets you refer to a fork’s commits through the main repo’s URL, which is a potential security vulnerability).
if so, launch ofborg and run <executable> --help.

Auto-commit or label (ready for commit).
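The first two checks can be sketched in Python roughly as follows. This is a sketch under assumptions, not a proposal for the real bot: the `[channel]` prefix is treated as optional, "ver and hash" is read as one `version` line plus one `sha256` or `hash` line, and all names here are made up. The upstream-vs-fork check and the ofborg run are not covered.

```python
import re

# Optional "[channel]" prefix, then "<name>: <old> -> <new>".
TITLE_RE = re.compile(
    r"^(?:\[(?P<channel>[^\]]+)\]\s*)?"
    r"(?P<name>[^:\s]+):\s*"
    r"(?P<old>\S+)\s*->\s*(?P<new>\S+)$"
)
HASH_ATTRS = {"sha256", "hash"}  # assumed names of the hash attribute


def changed_lines(diff_text):
    """Split a unified diff into stripped removed and added lines."""
    removed, added = [], []
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file headers, not content changes
        if line.startswith("+"):
            added.append(line[1:].strip())
        elif line.startswith("-"):
            removed.append(line[1:].strip())
    return removed, added


def attr_names(lines):
    """Attribute names on the left of '=' in lines like 'version = "1.0";'."""
    return {line.split("=", 1)[0].strip() for line in lines}


def is_trivial_bump(title, diff_text):
    """True if the title matches and only a version and a hash line changed."""
    if not TITLE_RE.match(title):
        return False
    removed, added = changed_lines(diff_text)
    if len(removed) != 2 or len(added) != 2:
        return False
    names = attr_names(added)
    return (
        names == attr_names(removed)
        and "version" in names
        and len(names & HASH_ATTRS) == 1
    )


# Invented example diff for a hypothetical package.
SAMPLE_DIFF = """\
--- a/pkgs/tools/misc/example/default.nix
+++ b/pkgs/tools/misc/example/default.nix
@@ -2,8 +2,8 @@
   pname = "example";
-  version = "1.2.0";
+  version = "1.3.0";
   src = fetchurl {
-    sha256 = "0000";
+    sha256 = "1111";
   };
"""
```

With this, `is_trivial_bump("example: 1.2.0 -> 1.3.0", SAMPLE_DIFF)` passes, while a title such as `example: add feature` or a diff touching a third line is rejected.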

What does this add over r-ryantm? Very little right now. I went through the GitHub pulls and maybe 96% of such PRs are from r-ryantm. The remaining 4% comes from the fact that the Repology version is not the latest (I think that’s what ryantm told me).

But there are obviously times and packages where Repology is not at the latest version. I’d argue that what r-ryantm tracks is not the latest version but the latest AUR version.

A quick turnaround might also make contributors more likely to submit, which imho is a bigger factor, and probably why AUR succeeds at usually being the fastest.

It’s even possible to extend this later in safe ways, based on how successful it is. E.g. only changing buildInputs or nativeBuildInputs sounds harmless. Probably the same for the cargo hash.

I can code this if this proposal makes sense.

schmittlauch · July 9, 2020, 10:02am

That might or might not be the case.
But either GitHub has had some platform issues in the last 24 hours, or nixpkgs now has so many issues and PRs that searching them does not work anymore.

Example: I always get a unicorn timeout at Issues · NixOS/nixpkgs · GitHub

Edit: FYI, I created a support topic at GitHub’s forum about that community · Discussions · GitHub

timokau · July 9, 2020, 9:22pm

The auto-merge would likely be controversial; see the related [RFC 0050] Merge bot for maintainers by FRidh · Pull Request #50 · NixOS/rfcs · GitHub. I’m not sure how useful the labeling is since, as you said yourself, the majority of such trivial PRs are bot PRs anyway. One downside would be that prioritizing such a label would create perverse incentives, making people try to please the labeling algorithm and avoid refactorings.

timokau · July 9, 2020, 9:25pm

As @doronbehar mentioned, I am working on a bot that is supposed to help with some of the issues covered here, in particular the discoverability of actionable PRs and the distribution of the review & triage effort. It’s still in development, but already running on nixpkgs and ready for testing.

More info here.

milahu · February 6, 2023, 2:56pm

i made a start in GitHub - milahu/nixpkgs-watcher: notifications for my favorite apps

spoiler: no code yet, only ideas

ping @Thra11

let’s collaborate :)