Darwin, again …

There are still people offering to join darwin-maintainers. Unpinning the issue will slow that down, of course. Maybe there are other ways to keep putting the word out?

It’s a tall task to keep up with the contributions of thousands of Linux-based contributors though.

@reckenrode is also spearheading an effort to thoroughly revise the Darwin stdenv, decoupling it from the bootstrap-tools. This should make some things a lot easier in the future, like LLVM bumps. Unfortunately the number of people with the confidence to review changes that touch stdenv is small. I’m trying to work closely with them, reviewing PRs to keep this moving forward.

There’s also work by @ConnorBaker trying to add more SDK versions, which builds upon the work by @reckenrode. The hope is that packages will be able to depend on a newer SDK when necessary. This has been a major source of friction.

I just wanted to highlight this to make it clear that Darwin maintainers are sympathetic to problems caused for Linux-based contributors. They are problems for us too! Not only in the strictest sense but also in the sense that we are aware that every time Darwin holds up Linux we’re losing another slice of support from Linux-based maintainers who try their best to not break Darwin. I think I speak for all Darwin users if I say we’re grateful for any and all consideration the Darwin platform is shown! :heart:

26 Likes

A few months ago, the Haskell team in Nixpkgs decided to hide Darwin jobs in our daily build report:

https://github.com/NixOS/nixpkgs/pull/223042

Effectively, we no longer need to worry about the build status of Haskell packages on Darwin before merging haskell-updates into master.

Our main reasons for making this change were:

  • Hydra’s Darwin infrastructure is just so, so, so, so flakey. We constantly had to restart builds that died or were killed for some random reason on Darwin. This happens much less on Linux. Darwin builds can also be waaay slower than Linux builds. I don’t know if this is because there are more Linux build machines available, or if the Linux machines are just higher-spec, but it is frustrating having to frequently wait for Darwin builds.

  • This might be specific to the Haskell infrastructure, but there are relatively few true Darwin-based build failures. Haskell is relatively cross-platform (well, at least between Linux and Darwin), and it is pretty rare to have (1) an actual build problem caused because of some quirk of Darwin, and (2) the build problem be fixable.

  • Only one of the Haskell maintainers owns a Darwin-based development machine, so most of us aren’t able to fix things or really even confirm that user-submitted PRs work. (Except for using ofborg, hydra, etc. But that is really painful.)

  • Few users submit PRs fixing build problems on Darwin. Even when we alert users that their Haskell packages fail to build on Darwin, most users don’t care. Most users don’t have any way of debugging or fixing problems.

As for my own personal opinion about Darwin support in Nixpkgs, I feel exactly the same as @piegames:

However, with my Nixpkgs Maintainer hat on, I do like that Nix works reasonably well on Darwin. I like that non-Linux users are still able to use Nix/Nixpkgs. I like that those users are able to contribute back to Nixpkgs, and make improvements that benefit people on all systems.

I sometimes wonder if Nix/Nixpkgs supporting Darwin brings in an out-sized amount of users, attention, money, etc to the community. For example, even if the amount of Darwin contributors in Nixpkgs is small, maybe it is much easier for any given company to decide to adopt Nix (because it is common for companies to have developers that use both Darwin and Linux). If Nix/Nixpkgs never had Darwin support, maybe Nix would be much more niche than it currently is.

20 Likes

Do you happen to know off hand what the error was? While working on Tracking issue for Darwin stdenv LLVM update · Issue #234710 · NixOS/nixpkgs · GitHub, I’ve noticed that GHC in particular seems prone to bad file descriptor errors with clang. I had assumed they were due to changes in clang 16, but if they’ve been happening with clang 11, then that would actually be pretty helpful to know.

1 Like

I use both NixOS and Nix x86_64-darwin every day. I agree that macOS support is very important, but also agree that there are major issues stemming from it.

There was talk in the past of having separate Linux/Darwin channels. This would help a lot with freeing up Linux to progress more rapidly, but would offload a lot of work onto the Darwin maintainers, and the Linux channel would still require some monitoring from the Darwin side for critical packages (thinking of things like the Curl issues a while back).

It would be helpful to have some way to be notified of upcoming breakage ahead of time. If I’m not monitoring master and staging, I usually only notice that my maintained packages have broken when the breakage hits master. This affects more than just Darwin: it can be difficult to keep static builds, cross-compilation, exotic platforms, and anything else that doesn’t have ofborg CI continually working.
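To make the kind of notification I have in mind concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the function name, the status strings, and the example data are made up, and it does not talk to Hydra or ofborg. It only shows the comparison itself: a package I maintain that builds on master but fails on staging is breakage that is on its way.

```python
def upcoming_breakage(maintained, master_status, staging_status):
    """Return maintained packages that build on master but fail on staging."""
    return sorted(
        pkg
        for pkg in maintained
        if master_status.get(pkg) == "ok" and staging_status.get(pkg) == "failed"
    )


# Made-up example data; real statuses would have to come from CI.
maintained = {"foo", "bar", "baz"}
master = {"foo": "ok", "bar": "ok", "baz": "failed"}
staging = {"foo": "failed", "bar": "ok", "baz": "failed"}

print(upcoming_breakage(maintained, master, staging))  # prints ['foo']
```

Packages that are already broken on master (`baz` here) are left out on purpose, since those are not news anymore.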

4 Likes

Do we have any numbers on the popularity of the respective platforms? The people replying in this thread, or even reading it at all, let alone that RFC, are probably not an accurate representation of all Nix users. What does the “silent majority” look like?

Maybe Mac users are disproportionately more “beginner”, less likely to seek out forums, discussions, or help.

Not that it would necessarily change anything, but if Mac is a “gateway drug” for NixOS (it certainly was for me, so maybe I’m just projecting), it would at least be good to know. And if it isn’t, that would also be good to know :slight_smile:

7 Likes

I doubt there is an ideal measure, but the community survey asks about platform use broadly. Here’s a screenshot of the chart in last year’s result thread

14 Likes

Absolutely! The benefit of reproducible developer environments on all machines is really appealing to companies, the only issues right now are maturity concerns, which are off-topic here.

If darwin support is dropped, the adoption rate would drop with it significantly, and I would have no argument in pitching nix to clients at all[1]. At that point, the projects will always choose the language-related package manager because that works reasonably well and consistently on both platforms.

[1]: Unless all developers were using Linux and it was clear that they’d all be using Linux for the foreseeable future, which is very rarely the case. And if it is the case, the company relies on a single distro for very specific software and has no use for Nix anyway (e.g. a robotics company using ROS on Ubuntu).

18 Likes

I agree with the broad strokes of this, but I want to go a bit further: I think people see something symbolic in nix’s (in)ability to abstract over platform differences.

From the inside, we know there’s no magic. Someone has to get cut by a platform difference, figure it out, and fix it.

But from the outside, I think people semi-fairly read our success or failure on this point as saying something ~profound about the power/ability of nix’s conceptual model to actually eat the world.

(I’ve said elsewhere that I see nix as having a bit of an ice-nine property. It wants to restructure much of what it interfaces with. We can either help, or admit that the interface is out of scope.)

Edit: tbf, I think our ability to handle cross-compilation is a similar signal of this ability, though I think fewer people pose that question.

11 Likes

I use Nix/nixpkgs as the package manager for my development environments. My use case is one that takes advantage of many of the strengths of Nix/nixpkgs, and I believe it is a common use case since it’s a major point pitched by many tools in the nix ecosystem and products built on top of it, such as GitHub - jetpack-io/devbox: Instant, easy, and predictable development environments, https://devenv.sh/, https://www.riff.sh/, https://getfleek.dev/. Being able to use Nix/nixpkgs on systems that aren’t NixOS, including darwin, is crucial to this use case. If that wasn’t possible I’d be using https://asdf-vm.com/ instead.

I’m using nixpkgs on fleets of 4 systems: aarch64-darwin, aarch64-linux, x86_64-darwin, x86_64-linux. The darwin systems are running macOS, the linux systems are running Ubuntu. In my experience, the darwin nixpkgs systems are about as well supported as the linux nixpkgs systems when those are used on non-NixOS linux distributions. I do not use NixOS on those fleets.

All this to say that darwin support, in fact good darwin support, is something I care about a lot.

I see that a lot of points in favor of darwin support, and of how we can make it better, have already been made on the thread, so I won’t reiterate them. I’m especially happy to see we’re doing some work to make the darwin bootstrap / stdenv a bit more compact and self-contained. I’ve definitely ended up changing the bootstrap while trying to fix something unrelated in the past, and I suspected that the ease with which you can cause nixpkgs-wide rebuilds on darwin was one of the factors behind long queue times for darwin jobs. I work in an Australian time zone, so I actually see the darwin queues be the same as the linux queues a lot of the time (for example, at time of writing they’re all at around 8k jobs), but sometimes they jump up to 100k jobs and take days to drain.

I’ll bring up one point that I brought up on the RFC you linked from a while ago: having and maintaining a system that isn’t linux helps nixpkgs cross-system compatibility beyond that specific system, and helps prevent system-specific idiosyncrasies from being depended on or assumed. So having darwin as tier 2 support helps the support of any other non-linux system. In fact it’s also possible that it helps support of non-NixOS linux systems, given that darwin is the only supported non-NixOS system, but I don’t have specific examples of that.

10 Likes

Alas, the changes probably won’t help there. LLVM is a big dependency, and newer versions are definitely not smaller than older ones. My goal with the decoupling is to remove the need to bump bootstrap tools just to update the stdenv. As long as the tools are good enough to rebuild cctools and LLVM, that’s all it really needs. The other changes are more for maintainability.

I reworked the Darwin stdenv to follow the Linux stdenv’s patterns where it makes sense. Using assertions to catch mistakes during evaluation has been a major boon to my productivity. I don’t know that I’d have been able to make some of the changes I needed to make otherwise. I’ve also tried to be liberal with comments explaining what it’s doing. My hope is these changes will make working on the Darwin stdenv more approachable.

3 Likes

I don’t remember off-hand, but if you’re willing to comb through old Nixpkgs Haskell status reports, you’ll definitely find some status reports where there are huge numbers of things failing on Darwin for strange reasons. The status reports are generated every 6 hours based on the state of the haskell-updates jobset on Hydra. If you look through old commits you can find past status reports:

1 Like

I can’t really argue with your point that Darwin maintenance requires resources from people who use Linux.

I do want to point out, however, that Darwin users (like me, if only occasionally) also contribute to Nixpkgs in a way that benefits Linux users.

9 Likes

Just to make sure that there is no miscommunication here: I am not saying we should drop Darwin support, or that Nix shouldn’t be used on Darwin machines or something. In fact I am totally in favor of supporting more platforms, including Darwin and WSL, if that helps some people.

And wanting these things is fine, but in the end somebody has to go ahead and put the work into it, because I won’t. This is up to the people who are invested into these platforms. (And some people already do great work here, thank you for that.) Note that Darwin and WSL are a bit special here because they are proprietary ecosystems; in contrast to other platforms for which I have no problems spending my time to some extent.

Platform support tiers are not a “goal” as in saying “yes we want this”, but they are a promise, a contract for developers and users about guarantees and expectations. And currently, Nix on Darwin does not fulfill the promises of a Tier 2 platform, and subjectively it has not improved by much in the last two years, especially regarding CI infrastructure. This makes Darwin effectively a Tier 3 platform with channel blocking.

Do you have some links to that? Given that channel blockers are a major point of contention, I’d be very interested in that. Unburdening maintainers who don’t use Darwin is kind of the goal though, and ideally the Darwin team would step up to fill that gap.

I totally agree that this is a pain, and that our pinging infrastructure needs to be worked on here. Unfortunately I don’t see many good solutions apart from “moar CI”, at the very least for evaluation.

4 Likes

Unpinning was necessary as we had many critical bugs and the release ongoing, I just repinned it.

As a release manager, I found the interaction with Darwin maintainers suboptimal: I didn’t feel there was anyone from the team there to help the release management team and drive it. It was extremely frustrating.
My goto example is 🦦 NixOS 23.05 — Feature Freeze & Release Blockers · Issue #224457 · NixOS/nixpkgs · GitHub
Darwin maintainers didn’t bother answering the call…

2 Likes

:heart:

FWIW, I’m not saying it needs to be a permanently pinned issue. I realize it’s valuable real estate. I’m not sure how to reach more potential Darwin maintainers though.

That’s not a matter of not bothering. That’s about no one wanting Darwin issues blocking the NixOS release, combined with the fact that no one feels like they have the authority to speak for the entirety of Darwin maintainers. The darwin-maintainers team is like the Nixpkgs Maintainers team in that regard, not like the Haskell team for instance. It’s not a coordinated group with any structure or hierarchy.

1 Like

That’s something for the Darwin maintainers to figure out :stuck_out_tongue:.

I find this to be a weak argument honestly, sending multiple Darwin messages on this thread is probably not a big deal IMHO, it’s built for that. Silence is worse than noise on those aspects.

Honestly, I think that should change: someone has to be the owner and take responsibility for it, as we can see that it doesn’t work without that.

Though, I don’t think you need hierarchy or structure to go and say “hi, darwin maintainer here, we don’t see anything problematic, or we do have those issues”; release managers get the final call here.

Otherwise, ignoring me is just making me more annoyed when I see potential release blockers and I have no idea how to evaluate them because I do not use Darwin.

I even suggested that Darwin maintainers should probably work on electing/choosing their Darwin release manager because of that.

But in any case, pinging @NixOS/darwin-maintainers doesn’t work very well in my experience: we ping them, but IIRC they usually don’t react in a timely fashion, which ends up putting the churn on annoyed Linux users who are blocked on Darwin-related things.

(and this was probably one of my biggest annoyances as a release manager for 23.05.)


And get me right: While I don’t care about Darwin, I do care about us providing Darwin for everyone and I can sympathize with people running Darwin (not necessarily by choice), though, I feel like:

The popularity earned by Darwin giving us NixOS contributors does not translate into significant manpower for Darwin maintenance or Linux contributions (please prove me wrong with data; I know I nerd-sniped someone into looking at this more seriously with data, so take this statement with a grain of salt.)

Therefore, I do not identify Darwin as an important aspect of the NixOS project beyond “users which are not contributors” metric, which is fine for people who can make money off this and invest in the project to improve the maintenance story.

Currently, the infra story on the ofborg side and the infra story on the community side (one remote builder provided by community members; I don’t even know if @winter pays the electricity costs, etc. for everyone else) are insufficient in general. Some community members went and had some money to work on this project, and I offered my own informal colo space here: macos build box for community use · Issue #493 · nix-community/infra · GitHub. Again, I cannot commit much to something I do not care about. If people build appliances and send them to me datacenter-style, I can host them, and if some crowdfunding covers the electricity cost, that is amazing. If not, you can also fund clouds to provide you with their spare Mac Minis (which are heavily quota’ed and in large demand), etc.

But I think this post is the wakeup call for everyone who cares about Darwin to organize a plan to make this situation better before running into reopening the downgrade tier RFC.

4 Likes

I heard someone hyperbolically call the Darwin Maintainers team the embodiment of the bystander effect … I think your comparison to the Nixpkgs Maintainers teams is appropriate, as the Darwin Maintainers team is getting too large for simply pinging it like one would for the Haskell team. So we need an alternative here, and I think adding a bit of hierarchy to the Darwin team would probably not be a bad idea. Maybe have a label to add instead of a team ping and then there is some automation which helps triaging and assigning individual people or something?
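Something like the following Python sketch is what I picture for the automation. All names here are hypothetical: the `needs-darwin-triage` label, the maintainer handles, and the package-to-maintainer mapping are invented for illustration and don’t correspond to anything that exists in Nixpkgs today. The idea is just that a label replaces the team-wide ping: people who maintain the affected packages get assigned, and if nobody matches, one person from a rotation is picked so the issue never lands on "everyone".

```python
from itertools import cycle

TRIAGE_LABEL = "needs-darwin-triage"  # hypothetical label name


def make_triager(package_maintainers, rotation):
    """Build a triage function from a package->maintainers map and a fallback rotation."""
    fallback = cycle(rotation)  # round-robin over the rotation list

    def triage(labels, packages):
        # Issues without the label are not our business.
        if TRIAGE_LABEL not in labels:
            return []
        # Prefer people who maintain one of the affected packages.
        owners = sorted(
            {m for pkg in packages for m in package_maintainers.get(pkg, [])}
        )
        # Otherwise assign exactly one person from the rotation.
        return owners if owners else [next(fallback)]

    return triage


triage = make_triager({"ghc": ["alice"]}, ["bob", "carol"])
print(triage({TRIAGE_LABEL}, ["ghc"]))   # a package maintainer is assigned
print(triage({TRIAGE_LABEL}, ["curl"]))  # nobody maintains it: rotation kicks in
```

Assigning a single named person in the fallback case is the whole point: it avoids the bystander effect of pinging 50+ people at once.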

3 Likes

The darwin-build-box is rented from Hetzner and reimbursed by the Nix :heart: macOS Open Collective.

It’s been established that if a response to a ping is not forthcoming, Darwin-breaking changes can proceed. We can’t have 50+ people confirming they’re not able to help or not interested in a certain package.

Personally, I check every ping and if it’s important or trivial I act, otherwise I remain silent. If explicit negative signals are preferred I can try to be louder. It’s just that that’ll lead to kicking other Darwin maintainers’ shins when they’re already on the case.

I’m not sure a tag will improve matters since maintainers would have to actively seek it out and we don’t have that level of inertia yet AFAICT.

7 Likes

Same, I also check every ping, and only respond if I have some idea or context to jump in. Some of the comments make it sound like this happens all the time, but looking at my email there are only tens of PRs per month where @NixOS/darwin-maintainers gets pinged (plus however many PRs where I ping it, since I don’t have email notifications enabled for my own messages). There’s really not that many.

3 Likes

I mentioned in the RFC before that darwin is the only non-linux non-gcc non-glibc platform with substantial support. How important is that capability? It might not be, and linux gcc glibc support would benefit greatly from not having the extra complexity of supporting other OSes / kernels, C compilers and C stdlibs.

If it is important, then I think we can use that as a way to reduce the burden of darwin development in nixpkgs: linux + clang would cover llvm / clang, linux + musl would cover non-glibc, and freebsd or illumos could cover a separate OS (incidentally, freebsd shares BSD heritage with darwin, so it would help some aspects of darwin support). None of those three requires any specialised hardware or is tied to proprietary software.

If we had coverage of those three axes along which the current tier ~2 darwin systems are different from the tier ~1 linux systems, I believe the remaining darwin-specific issues would be far fewer, and much more likely to be specific to darwin alone.
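To spell out the three axes, here is a small Python sketch. The platform labels and their kernel / compiler / libc values are my own simplified shorthand, not data taken from Nixpkgs. For each axis on which darwin departs from the linux gcc glibc baseline, it lists which of the proposed platforms also departs from the baseline on that axis.

```python
from typing import NamedTuple


class Platform(NamedTuple):
    kernel: str
    compiler: str
    libc: str


# Simplified, illustrative labels (assumptions, not Nixpkgs data).
BASELINE = Platform("linux", "gcc", "glibc")
DARWIN = Platform("darwin", "clang", "libSystem")
PROXIES = {
    "linux+clang": Platform("linux", "clang", "glibc"),
    "linux+musl": Platform("linux", "gcc", "musl"),
    "freebsd": Platform("freebsd", "clang", "freebsd-libc"),
}


def differing_axes(a, b):
    """Names of the axes on which two platforms differ."""
    return {axis for axis in Platform._fields if getattr(a, axis) != getattr(b, axis)}


def axis_coverage(target, baseline, proxies):
    """For each axis where target leaves the baseline, list proxies that leave it too."""
    return {
        axis: sorted(
            name for name, p in proxies.items() if axis in differing_axes(p, baseline)
        )
        for axis in sorted(differing_axes(target, baseline))
    }


for axis, names in axis_coverage(DARWIN, BASELINE, PROXIES).items():
    print(axis, names)
```

Note that "covered" here only means the non-default code path on that axis gets exercised (musl is not libSystem, freebsd is not darwin), which is why some darwin-specific issues would still remain.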

Note this isn’t an entirely practical suggestion. All three of those are currently less supported than darwin, and it’s possible any of the three would be more effort to support than darwin, let alone all three of them together. But I want to raise it to highlight how the issue isn’t limited to darwin, but applies to all non-linux non-gcc non-glibc systems.

2 Likes