Packages marked as broken should come with an explanation

tysonzero · May 17, 2022, 1:58am

For non-trivial Haskell projects I pretty inevitably end up marking a whole bunch of packages as not broken, and as far as I can remember I have been bitten by bugs pretty equally often in packages not marked as broken as I have by packages marked as broken.

In some cases the marked-broken package doesn’t build at all, and I am less concerned with those situations.

It’d be really nice if broken was more of a Maybe Text than a Bool so that it always came with an explanation, so that I can know what to watch out for if I choose to override it.

In cases where for whatever reason an error message is too difficult to provide (automated tooling) then worst case just the string “broken” or whatever can be provided, but I assume a pretty sizable fraction of cases would be able to provide at least a little bit more than that.

Growpotkin · May 17, 2022, 3:18am

These get marked broken by automated tools. There are literally thousands of them, they change every few hours, and frankly the PR review process is currently limited to a small number of people (wish they would expand that). Even if there were the manpower to debug and report on the bazillion packages that break on a given day, PRs adding meta messages would sit in review forever. The time it takes to review them is time taken from merging PRs that fix broken packages.

It’d be nice to have detailed messages, but honestly, if you want details you have to go get your hands dirty.

amjoseph · May 18, 2022, 12:39am

Yikes.

Do you have any insight as to why this is?

Recently I wanted to try out purescript (which is written in Haskell), and after much struggling am still unable to get nix to build it from source. To my great surprise, the purescript expression simply downloads binaries. I’ve encountered a disheartening level of breakage in haskellPackages which seems inconsistent with the extremely high level of quality throughout the rest of nixpkgs.

It’s been a while since I was writing Haskell code on a regular basis, but I see that cabal now has something analogous to Rust’s Cargo.lock. Are Haskellers simply not using this facility? Although I find that Cargo.lock makes rust projects more brittle, it is also the reason why rust programs always seem to build correctly on the first try (as long as you aren’t trying to modify them, of course!)

I see that haskellPackages appears to be attempting to curate a massive list of exactly one golden version for each package instead of the “nix way” of allowing multiple different versions of the same software to be installed simultaneously, and the similar “rust way” of simply abandoning any hope of ever being able to do that. I suppose rust gets away with doing that because it (essentially) does not support dynamic linking so you’re always doing worst-case amounts of rebuilding anyways.

ryantm · May 18, 2022, 1:38am

instead of the “nix way” of allowing multiple different versions of the same software to be installed simultaneously

This is the “nix way” but is not the “nixpkgs way”. We try to keep as few versions of software around as possible, because it is hard to reason about multiple versions and it leads to bloat on people’s systems: the same closure having two copies of some big library is not good if it could be avoided.

I don’t know much about this lock format or if the nixpkgs Haskell maintainers have done much with it yet, but it seems like this could potentially enable a buildHaskellPackage like buildRustPackage, and make maintaining the haskell package set less needed. I suspect some people would still want to keep it around though, because they probably consider it to be like an improved form of Stackage.

cdepillabout · May 18, 2022, 4:33am

This is quite a wide topic that would apply to all of Nixpkgs. I agree that it would be nice to have a reason why packages are marked as broken. If you look through the source code, sometimes people leave comments when marking things as broken, so that can be helpful.

A lot of the responses in this thread have been Haskell-related, though, so let me try to reply to them.

I think you might be aware of this, but in Nixpkgs we have some automation around marking Haskell packages that fail to build on Hydra as broken.

Here’s an example PR where we do this:

github.com/NixOS/nixpkgs

haskellPackages: update stackage and hackage

NixOS:master ← NixOS:haskell-updates

opened 05:58AM - 14 May 22 UTC

expipiplus1

+683 -290

### This Merge

This PR is the regular merge of the `haskell-updates` branch into `master`.

This branch is being continually built and tested by hydra at https://hydra.nixos.org/jobset/nixpkgs/haskell-updates. You may be able to find an up-to-date Hydra build report at [cdepillabout/nix-haskell-updates-status](https://github.com/cdepillabout/nix-haskell-updates-status).

We roughly aim to merge these `haskell-updates` PRs at least once every two weeks. See the @NixOS/haskell [team calendar](https://cloud.maralorn.de/apps/calendar/p/H6migHmKX7xHoTFa) for who is currently in charge of this branch.

### haskellPackages Workflow Summary

Our workflow is currently described in [`pkgs/development/haskell-modules/HACKING.md`](https://github.com/NixOS/nixpkgs/blob/haskell-updates/pkgs/development/haskell-modules/HACKING.md).

The short version is this:

* We regularly update the Stackage and Hackage pins on `haskell-updates` (normally at the beginning of a merge window).
* The community fixes builds of Haskell packages on that branch.
* We aim at at least one merge of `haskell-updates` into `master` every two weeks.
* We only do the merge if the [`mergeable`](https://hydra.nixos.org/job/nixpkgs/haskell-updates/mergeable) job is succeeding on hydra.
* If a [`maintained`](https://hydra.nixos.org/job/nixpkgs/haskell-updates/maintained) package is still broken at the time of merge, we will only merge if the maintainer has been pinged 7 days in advance. (If you care about a Haskell package, become a maintainer!)

---

This is the follow-up to #172337. Come to [#haskell:nixos.org](https://matrix.to/#/#haskell:nixos.org) if you have any questions.

Here’s an explanation for how the @NixOS/haskell team operates in Nixpkgs:

github.com

NixOS/nixpkgs/blob/52dc75a4fee3fdbcb792cb6fba009876b912bfe0/pkgs/development/haskell-modules/HACKING.md


# Maintainer Workflow

The goal of the [@NixOS/haskell](https://github.com/orgs/NixOS/teams/haskell)
team is to keep the Haskell packages in Nixpkgs up-to-date, while making sure
there are no Haskell-related evaluation errors or build errors that get into
the Nixpkgs `master` branch.

We do this by periodically merging an updated set of Haskell packages on the
`haskell-updates` branch into the `master` branch.  Each member of the team
takes a two week period where they are in charge of merging the
`haskell-updates` branch into `master`.  This is the documentation for this
workflow.

The workflow generally proceeds in three main steps:

1. create the initial `haskell-updates` PR, and update Stackage and Hackage snapshots
1. wait for contributors to fix newly broken Haskell packages
1. merge `haskell-updates` into `master`

This file has been truncated.

What we unfortunately don’t have is any sort of automation around marking packages as unbroken once they start building again (for instance when a new version is released to Hackage that fixes the build). No one has come up with a good way to do this, and we’d love help if someone could automate this.

This is quite concerning! This hasn’t been my experience using the Haskell stuff from Nixpkgs, but if you’re running into bugs in packages not marked as broken, we’d love to get bug reports!

Also, sometimes it seems like users aren’t aware, but the only Haskell packages that are correctly marked as broken are the ones in haskellPackages. We provide Haskell package sets based on other GHC versions, but the Haskell packages in them don’t come with any guarantees. For instance, haskellPackages.lens is more-or-less guaranteed to build and work as long as it is not marked as broken (and if it is marked as broken, then it may or may not be broken). But something like haskell.packages.ghc8107.lens may not compile, even though it is not marked as broken.

We currently don’t have any infrastructure to correctly mark packages as broken from Haskell package sets based on the different compilers. (Although this would of course be a nice improvement if we could do it.)

I’m not sure if you’re talking about haskellPackages, or all of Nixpkgs, but I’m assuming you’re talking about haskellPackages (if not, please ignore this).

I don’t feel like this is the best characterization of how we maintain haskellPackages. Haskell packages do get marked broken by automated tools (although it wouldn’t be crazy to imagine the automated tools were made smarter and were able to parse the Cabal build log to try to figure out the reason for a build failure). And there are literally thousands of broken packages from Hackage. We currently have about 6,000 to 7,000 working Haskell packages, out of like 16,000 on Hackage.

But the Haskell packages don’t change every few hours. They are updated and marked broken in PRs like “haskellPackages: update stackage and hackage” (NixOS/nixpkgs Pull Request #172982, by expipiplus1), which happen about once a week or once every other week. I’d say on average maybe about 10 to 30 packages are marked broken every round, but that can really vary depending on what’s going on.

I do definitely want to disagree with this, though! The PR review process is quite explicitly open to absolutely everyone. All you need is a GitHub account to participate in reviewing Haskell package PRs. The Haskell maintainers in Nixpkgs would absolutely love more help with reviewing PRs.

In Nixpkgs, we do have a pretty good response rate to issues/PRs tagged with haskell. It is pretty rare to send a Haskell-related PR to Nixpkgs and never get any activity on it:

If you scroll through this list, it is rare to find an issue/PR without any comments (at least in the last year or so).

cdepillabout · May 18, 2022, 4:48am

You picked one of the worst packages to try to get building from source.

haskellPackages is generally maintained as one large set of coherent packages that all compile together. Like stackage, but with more packages. Right now in nixpkgs master, haskellPackages is based on Stackage LTS-19. So you get all the package versions from LTS-19, plus the latest of each package on Hackage.

The problem with PureScript is that it is currently based on LTS-18:

github.com

purescript/purescript/blob/211e67d4e7d186682ea70e8740055ad4e6624671/stack.yaml#L3


      
# Please update Haskell image versions under .github/workflows/ci.yml together to use the same GHC version
# (or the CI build will fail)
resolver: lts-18.15
pvp-bounds: both
packages:
- '.'
ghc-options:
  # Build with advanced optimizations enabled by default
  "$locals": -O2 -Werror
extra-deps:
# As of 2021-11-08, the latest release of `language-javascript` is 0.7.1.0,
# but it has a problem with parsing the `async` keyword.  It doesn't allow
# `async` to be used as an object key:

You can see how this would cause problems with getting it to build. The PureScript maintainers generally do not try to keep up with the latest LTS, which makes it hard to get PureScript building from source in Nixpkgs.

In the past couple years at least, I’ve been the only one to get PureScript building from source. If you look through Hydra, you can find some times when haskellPackages.purescript actually builds. Although it inevitably breaks again as stackage and hackage move on.

As you and @ryantm suggest, some sort of buildHaskellPackage function would be really nice. It would be nice to be able to build a Haskell package given a stack.yaml or stack.yaml.lock as input. There have been old projects attempting this, but nothing recent doing this (other than haskell.nix).

I’m somewhat surprised to hear this. Aside from packages like PureScript that don’t keep up with Hackage/stackage, we have quite a lot of Hackage building in Nixpkgs! I think we’re doing a pretty good job, especially considering there are only four of us in the @NixOS/haskell team. We of course always welcome PRs getting more packages building!

fgaz · May 18, 2022, 8:31am

I’m working on a project that does exactly that: cabal-nix (temporary name, suggestions welcome). Like the rust builder, it’s based on fixed-output-derivations and it builds the entire dependency tree at once, so it has a different target than haskell.nix.

It’s still very experimental, and using it for obtaining a dev environment is not great (working on that), but it works well enough for deployment. I’ll properly announce it when it’s ready, but I figured I’d mention it since it’s relevant to the discussion.

fricklerhandwerk · May 18, 2022, 8:33am

There is a PR to do something like that: “Stdenv overhaul `broken` flag” (NixOS/nixpkgs Pull Request #140325, by teto).

sternenseemann · May 18, 2022, 10:22am

Note also that there’s an issue about this. I’m planning on tackling this after the next cabal2nix release is finished.

The comparison between Cargo.lock and cabal freeze files is inaccurate. Cabal requires what you can call “local consistency”, meaning that exactly one version of a package may be available at build time. This has the practical implication that a “globally consistent” package distribution is possible, i.e. Stackage and nixpkgs’ haskellPackages. A globally consistent distribution has the following practical advantages:

Having the only versions of a package available explicitly packaged and not guarded by a lock file allows you to effectively react to security advisories and to patch critical bugs. This point has been made well in “The modern packager’s security nightmare”. The point kind of makes itself looking at all the crates with advisories we use because of lock files.

I strongly believe that our current approach yields a higher quality: We are running tests on every package where it is possible (something buildRustPackage does not), having greater confidence it’ll work correctly at run time, and have a lot of code working around specific problems. A lot of them are self-inflicted of course (version incompatibilities, overzealous constraints), but a significant portion are workarounds for compiler bugs and fixes for a package’s interaction with the unusual environment that is Nix/NixOS. Additionally, these are benefits inherited by downstream users for free where in a buildRustPackage approach you’d likely have to add the same workaround in every individual derivation.

Most importantly of all, this would certainly be unfeasible in practice: Building some Rust derivations is already painfully slow, but GHC performs even worse than rustc. Rebuilding the same dependencies in different derivations all over again (as happens with a buildRustPackage approach) would waste CPU time and make rebuilding e.g. pandoc a painful ordeal. Providing a binary cache for developers’ downstream projects would also not be possible, which has been a killer feature of the nixpkgs Haskell infrastructure from day one (in fact it got me and many others into Nix in the first place).

Related to that, we can do expensive build configurations for downstream benefit, e.g. build profiling information, so no one has to rebuild their entire dependency closure to profile an executable.

I wouldn’t be opposed to a solution that allows using a stack lock file / cabal freeze file for individual packages that are difficult to build with modern Haskell libraries, but it wouldn’t be a sustainable large-scale solution in my estimate.

I fail to see how just wrapping whatever cargo does in a derivation (using cargo vendor and cargo build) is the Nix way to be honest. In fact, in this approach the derivation abstraction provided by Nix is arguably under-utilized.

I’d be curious what exactly the “disheartening level of breakage” you describe is. The quality is lower than the rest of nixpkgs sure, since derivations are added automatically via code generation of which some can’t be expected to work (and are marked as broken). Such derivations would of course never be added if they were maintained manually, as the rest of nixpkgs is.

From my perspective (one of the Haskell maintainers, that is) the situation is as follows: We have 4 core Haskell maintainers and I guess ~20 other occasional contributors, all of them volunteers. Together we maintain a consistent (“blessed”) Haskell package set that has over 6000 packages working, tested over 3 platforms (aarch64-linux, x86_64-darwin and x86_64-linux) and working to a high degree on a fourth (aarch64-darwin). We were able to go as far as over 7000 before recent ecosystem changes. Hackage provides 16000 packages in total (many if not most of them we probably can’t hope to get to work), less than half of our working packages are part of Stackage LTS. I would conclude: Not bad, considering that we can’t do arbitrary version constraint solving.

Of course, we necessarily are a kind of opinionated Haskell distribution (much like Stackage), so if you are trying to get a differing approach to work with it, the experience won’t be great. Then again, the primary objective has always been packaging Haskell software for NixOS / the nixpkgs software distribution which has generally worked well so far.

Sorry for going on this huge tangent, I realize the thread is actually about something else. As for marking things broken, a big problem relating to Haskell I’ve observed is that it is often hard to tell what in a NixOS configuration caused a broken eval failure without the use of --show-trace. We caused this situation accidentally by improving the accuracy of the broken report and reducing the amount of packages that were marked as broken by accident. Not sure how exactly this could be improved, I think ideally we could add more tracing abilities to the module system.

dschrempf · May 18, 2022, 1:23pm

I would like to add that many Haskell packages are “broken” because of circumstances and constraints the Nixpkgs team has no influence on (most often because of too strict version bounds).

Many times, now broken packages are actually working on older Nixpkgs channels or revisions.

Sometimes, it is enough to use another package set. For example, haskell.packages.ghc8107.
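
For readers who have not done this before, here is a minimal sketch of overriding a broken flag and relaxing version bounds yourself. It assumes `<nixpkgs>` is on your `NIX_PATH`; `lens` is only a stand-in (it is not actually marked broken), so substitute the package you care about. Whether the result builds is not guaranteed, which is the point of this thread.

```nix
let
  pkgs = import <nixpkgs> { };
  hlib = pkgs.haskell.lib;

  # Extend the default Haskell package set with a local override.
  myHaskellPackages = pkgs.haskellPackages.override {
    overrides = self: super: {
      # Stand-in package: replace `lens` with the package you need.
      # markUnbroken clears the broken flag; doJailbreak drops the
      # version bounds from the package's Cabal file.
      lens = hlib.doJailbreak (hlib.markUnbroken super.lens);
    };
  };
in
myHaskellPackages.lens
```

The same override can be applied to a compiler-specific set (for example the `haskell.packages.ghc8107` set mentioned above) by swapping it in for `pkgs.haskellPackages`, assuming that set exists in your Nixpkgs revision.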

Finally, I think a lot of these issues would be handled best by providing good documentation. I would really appreciate it if we could get a proper Nixpkgs Haskell manual up and running, so that we can all understand and contribute more easily.

FRidh · May 18, 2022, 2:39pm

sternenseemann:

The comparison between Cargo.lock and cabal freeze files is inaccurate. Cabal requires what you can call “local consistency”, meaning that exactly one version of a package may be available at build time. This has the practical implication that a “globally consistent” package distribution is possible, i.e. Stackage and nixpkgs’ haskellPackages. A globally consistent distribution has the following practical advantages:

Having the only versions of a package available explicitly packaged and not guarded by a lock file allows you to effectively react to security advisories and to patch critical bugs. This point has been made well in “The modern packager’s security nightmare”. The point kind of makes itself looking at all the crates with advisories we use because of lock files.

I strongly believe that our current approach yields a higher quality: We are running tests on every package where it is possible (something buildRustPackage does not), having greater confidence it’ll work correctly at run time, and have a lot of code working around specific problems. A lot of them are self-inflicted of course (version incompatibilities, overzealous constraints), but a significant portion are workarounds for compiler bugs and fixes for a package’s interaction with the unusual environment that is Nix/NixOS. Additionally, these are benefits inherited by downstream users for free where in a buildRustPackage approach you’d likely have to add the same workaround in every individual derivation.

Most importantly of all, this would certainly be unfeasible in practice: Building some Rust derivations is already painfully slow, but GHC performs even worse than rustc. Rebuilding the same dependencies in different derivations all over again (as happens with a buildRustPackage approach) would waste CPU time and make rebuilding e.g. pandoc a painful ordeal. Providing a binary cache for developers’ downstream projects would also not be possible, which has been a killer feature of the nixpkgs Haskell infrastructure from day one (in fact it got me and many others into Nix in the first place).

Related to that, we can do expensive build configurations for downstream benefit, e.g. build profiling information, so no one has to rebuild their entire dependency closure to profile an executable.

offtopic:

You describe exactly the reason why I would like us to keep providing consistent package sets as a distribution, even though it can be a pain to form such a package set for some languages (hello Python). It would be great if other languages/frameworks also had such upstream distributions that we could piggyback on.

SergeK · May 18, 2022, 3:59pm

(Yes please! Of any mass changes affecting interfaces that I can think of, this is probably most wanted)

tysonzero · May 18, 2022, 7:51pm

To clarify, the rate of (runtime) bugs in general is quite low and well within expectations; this was more to point out how not-broken “broken” packages are, and wasn’t meant to be a criticism of “non-broken” packages.

amjoseph · May 18, 2022, 11:41pm

I’d be curious what exactly the “disheartening level of breakage” you describe is.

I probably phrased that poorly.

Twice I encountered a situation where a package, written in Haskell, had a top-level expression which silently downloaded binaries to my machine in a situation where there was no reason for me to expect that it would do this. I would be a lot less sensitive to this kind of thing if the PR implementing RFC 89 wasn’t stuck in limbo. It is a real sore point for me that I have no way to block nixpkgs from downloading binaries, nor to submit PRs that designate such packages when I discover them.

In both cases the corresponding haskellPackages.* expression would not build, so I assume the binary-downloading was due to that being the only way to get people working software. Maybe I should not have jumped to that conclusion.

The first situation where I encountered this was purescript, which I’ve already mentioned. I tried to manually fix the haskellPackages.purescript expression, but hit a lot of problems with bower-json and old versions of aeson. So then I gave up trying to use nixpkgs to build purescript and decided to use cabal directly, but starting with using nixpkgs to build cabal. It turned out that haskellPackages.cabal is marked broken, and <nixpkgs>.cabal does the downloading-binaries (of GHC!) thing. At that point I began to wonder how deep the rabbit-hole went…

Cabal requires what you can call “local consistency”,

If a package manager only permits installing or depending on one version of each package then a software distribution for that package manager should bless one version of each package. The blessed version of each package must be compatible with the blessed version of every other package.

Well, that “if clause” seems extremely inapplicable to nixpkgs…

amjoseph:

the “nix way”

I fail to see how just wrapping whatever cargo does in a derivation (using cargo vendor and cargo build) is the Nix way to be honest.

It isn’t, which is why I prefer carnix/crate2nix.

I see lockfiles as a starting point for generating a “guaranteed to build” nix expression that allows overrides. I consider the current importCargoLock to be an interim solution at best.

Having the only versions of a package available explicitly packaged and not guarded by a lock file allows you to effectively react to security advisories and to patch critical bugs.

Overrides are then applied to that expression in order to react to security issues that upstream decides to ignore. This should be possible unless the package depends on an old version of a library with a security problem and the fix hasn’t been backported; in this very specific situation the right response is to mark it broken.

Of course, this requires one IFD-rule exception for each language, so we can use carnix/crate2nix-style lang2nix tools to produce overridable nix expressions from lockfiles. @sternenseemann, I see that you are one of the shepherds for RFC 109 which makes it possible to move away from interim solutions like importCargoLock and towards tools like carnix/cargo2nix/lang2nix that turn a lockfile into an overridable derivation.

Perhaps my gripe here has more to do with Cabal than nixpkgs. I think it’s a bit weird that a programming language has a Long Term Support classification and timeline. I depend on packages like tex, runit, and djbdns that were written decades ago and still work beautifully. Imagine if the C programming language simply “expired” all the code written more than a few years ago and declared it “beyond LTS!”

abathur · May 19, 2022, 2:15am

A minor tangent… If this existed, I think the most valuable version could be platform-variant and able to express the difference between things that are broken for a lack of effort/time (please try to fix me!) VS things that are intrinsically broken (I am broken and cannot be fixed until some pr/rfc/etc. lands; save yourself!)

amjoseph · May 19, 2022, 8:07am

This is already possible; meta = { broken = stdenv.isDarwin; } appears in many places in nixpkgs.

I think a simple string gets the job done here, no?

It would be a pretty simple RFC to propose that in pkgs/stdenv/generic/check-meta.nix the metaTypes.broken attribute be changed from bool to either bool str, where a pkg is considered to be broken if pkg.meta.broken or false != false.
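
To make the proposed semantics concrete, here is a small sketch in plain Nix that needs no Nixpkgs import. The names `isBroken` and `brokenReason` and the three stand-in packages are made up for illustration; nothing here is taken from check-meta.nix, and the reason string is a placeholder.

```nix
let
  # Hypothetical predicate: `broken` may be absent, a bool, or a reason string.
  isBroken = pkg: (pkg.meta.broken or false) != false;

  # Hypothetical helper: the reason string, or null when none was given.
  brokenReason = pkg:
    let b = pkg.meta.broken or false;
    in if builtins.isString b then b else null;

  # Stand-in packages; real derivations carry many more attributes.
  fine = { meta = { }; };
  legacy = { meta.broken = true; };
  explained = { meta.broken = "placeholder reason text"; };
in
{
  results = map isBroken [ fine legacy explained ];
  reason = brokenReason explained;
}
```

Evaluated with `nix-instantiate --eval --strict`, `results` comes out as `[ false true true ]`: existing `broken = true` assignments keep working, and any string also counts as broken while carrying its explanation.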

I count only 25 occurrences of .broken in nixpkgs which are not assignments to the broken attribute (i.e. are not foo.broken = true):

$ rg -g \*.nix '\.[ ]*broken' | grep -v '\.broken =' | wc -l
25

(Technically there are contorted ways (let b = "broken"; in pkg.${b}) to access the broken field that wouldn’t match this regexp). The PR to implement the RFC would be one line needing careful attention and 25 lines of straightforward/LGTM updating.

Hopefully somebody will submit an RFC for this after the release and people have had time to decompress. I can do it if nobody else wants to.

abathur · May 19, 2022, 1:54pm

How would you express that a package is currently broken on Linux and Darwin x86-64 because no one’s had success fixing it, and is broken on darwin-aarch64 because the upstream hasn’t addressed compatibility issues yet?

dschrempf · May 19, 2022, 2:50pm

I don’t understand the either bool str part. Shouldn’t it be like (Haskell syntax) data Broken = Fine | BrokenBecause String?

What do you mean by false != false?

Couldn’t we say:

meta.broken = if isDarwin
                        then BrokenBecause "ReasonA"
                        else if isLinux
                                then BrokenBecause "ReasonB" ...
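
One way the idea above might look in plain Nix, under the `either bool str` proposal from earlier in the thread, is sketched below. This is an assumption-laden illustration: `brokenFor` is a made-up helper, the reason strings are placeholders, and `meta.broken` is a plain bool today, so a string here only makes sense if that proposal lands.

```nix
let
  # Hypothetical helper: a per-platform reason string, or false when fine.
  # Inside Nixpkgs the argument would typically be stdenv.hostPlatform.system.
  brokenFor = system:
    if system == "aarch64-darwin" then
      "upstream has not addressed aarch64-darwin compatibility yet"
    else if builtins.elem system [ "x86_64-linux" "x86_64-darwin" ] then
      "fails to build; nobody has found a fix yet"
    else
      false;
in
{
  # Stand-in for a package's meta attribute set.
  meta.broken = brokenFor "aarch64-darwin";
}
```

Because the value is an ordinary Nix expression, a platform-dependent reason needs no dedicated `Fine`/`BrokenBecause` constructors: `false` plays the role of "fine" and any string plays the role of "broken because".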

NobbZ · May 19, 2022, 3:40pm

They meant (pkg.meta.broken or false) != false, where pkg.meta.broken or false is shorthand for if pkg.meta ? "broken" then pkg.meta.broken else false. For a package to count as broken, that value needs to not be equal to false. Any string is “not equal to false”.

maralorn · May 19, 2022, 7:33pm

That rabbit hole is not as deep as one might think. The package you want is pkgs.cabal-install which is basically the same as pkgs.haskellPackages.cabal-install. We follow the naming of the packages on hackage.
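
As a minimal sketch of using that package, a `shell.nix` along these lines should give you cabal and GHC from Nixpkgs. It assumes `<nixpkgs>` is on your `NIX_PATH` and uses whatever GHC version your Nixpkgs revision ships as the default `pkgs.ghc`.

```nix
let
  pkgs = import <nixpkgs> { };
in
pkgs.mkShell {
  # cabal-install provides the `cabal` executable; ghc is the default compiler.
  packages = [ pkgs.cabal-install pkgs.ghc ];
}
```

Running `nix-shell` in the directory containing this file then puts `cabal` and `ghc` on the `PATH`.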

But yeah, we use a binary ghc for bootstrapping because at this point it would require about a day to compile the chain from gcc to a current ghc.

Just for the record, the haskell.nix project makes that approach (lockfile → one derivation per dependency with correct version) possible. Sadly it is incompatible with all Haskell packages in nixpkgs. I would love to have this functionality in nixpkgs.

Let’s disentangle that a bit.
stackage (independent of cabal-install the tool and Cabal the library) provides an LTS set of packages known to build together. Just because packages are not in there does not mean they are “expired”. It’s just that no one took on the task of including them in stackage. Even very important projects like haskell-language-server are not on stackage, so being “beyond LTS” (i.e. not on stackage) is not in any way a form of dismissal. stackage is an addition to the ecosystem making life for maintainers and developers easier.

cabal-install or ghc do not know anything about stackage and will happily build anything no matter how old it is.