Nix 🖤 macOS Monthly

May 2021

This month was spent getting the bootstrap-tools and stdenv to build with LLVM 11. This required quite a few patches, most of which flip clang switches back to their clang-7 behavior, like -fcommon.
I based this work on staging, which had the Apple Silicon PR merged in time for the 21.05 release, hoping to make it into the release as well, but I got stuck on two regressions introduced by that PR.
These were quickly patched by @thefloweringash :heart:.
On @LnL7’s advice I’ve rebased the LLVM bump PR on a specific Nixpkgs master commit, so we can have a Hydra jobset that compares well with existing evaluations, and finish up this work.

In other news, the Apple Silicon PR was merged in time for release 21.05 :tada:.
This means we’re getting really close to officially supporting AArch64 Macs, pending some fixes to enable AArch64 Hydra builds and a Nix release.
There has also been progress on getting NixOS tests to run virtualized on Darwin by r2r-dev.

I think that’s done now, at least for the short term. NixPkgs branches covered:

A stable Nix release (for aarch64-darwin) is being worked on here, but it seems to require a change to the nix jobset in Hydra.

This is great! Thanks for posting these monthly updates.

June 2021

The Hydra build of the LLVM bump progressed very slowly because the minis were swamped by a couple of staging rebuilds, so I focused on the SDK bump. I rebased the previous work onto the LLVM bump and started updating the Apple source releases. The source releases are what Nixpkgs tries to depend on for macOS builds when possible. Updating them involves a few kinds of changes. When new headers are added, they have to be added to the header lists the releases are checked against, as for CommonCrypto this time around. Sometimes headers change location in the source tarball, requiring an update to the buildPhase, like for the new libdispatch. And occasionally headers are removed in a new version. In that case we create a package that combines headers from the new and old versions so code that relies on these headers still compiles, as happened for libdispatch and xnu.

I looked into building packages against newer SDK frameworks, which came up in this issue, but I got stuck on the rewriting of dylib paths and didn’t see a clear way forward. Finally, with most of the Hydra build for the LLVM bump finished, I was able to start looking into fixing the cups build, which caused most of the new build failures because it is a common (transitive) dependency. I have a patch, but I ran into failures the Hydra builders haven’t encountered and am looking into them. The new Hydra evaluation has started.

I also wrote a tutorial to get people started on recent versions of macOS (soon regardless of CPU), with step-by-step uninstallation instructions to reduce the fear of ending up with a cluttered system and to make it easy to start from scratch. This is intended to go into the Nix manual. A nix-darwin setup tutorial would be next.

July 2021

This month was spent focusing on the LLVM bump. The latest Hydra evaluation is promising. The most common build failure in this evaluation is icoutils. Fixing it required a patch to Libc, which in turn revealed an impurity in the libpulseaudio build: two steps forward, one step back. But we’re getting closer to an LLVM 11 stdenv on Darwin. If there are packages in the evaluation that are failing to build but are important to you, patches are welcome.

I also looked into why Firefox fails to build on Darwin. The first stumbling block is jemalloc and I’m talking to the maintainers. As a temporary workaround I’ve uploaded the expression I use, which fetches the latest app in a dmg from mozilla.org, to NUR. Making use of it should be as simple as putting an override in your config:

{
  nixpkgs.config.packageOverrides = pkgs: {
    nur = import (builtins.fetchTarball "https://github.com/nix-community/NUR/archive/master.tar.gz") {
      inherit pkgs;
    };
  };
}

And then you can use nur.repos.toonn.apps.firefox to install the package. (For setups without nix-darwin take a look at the NUR installation instructions.) The expression has a version argument, so it’s possible to install a specific Firefox version, any version since 86.0, like so: nur.repos.toonn.apps.firefox.override { version = "86.0"; }.

August 2021

At the start of the month I finished up some work on the LLVM bump. I patched Libc to define TARGET_OS_EMBEDDED if it is not defined, because LLVM 11 is stricter about the use of undefined identifiers. I also updated the libpulseaudio expression to build without impurities. This should take care of most of the major breakages introduced by the LLVM update. I’m currently waiting for a Hydra evaluation to confirm this, but the x86_64 Darwin builders are swamped by a stdenv rebuild due to an OpenSSL update and Haskell package updates.

Most of my time was spent troubleshooting the configd build. I’m still working on this branch, though I’ve been going back and forth on changes recently and have yet to distill them into commits to push. Updating the Apple source releases is an integral part of bumping the SDK version. In Nixpkgs they are an amalgam of packages from different versions of macOS: our XNU has headers from two versions of XNU, for example, and Security and configd haven’t been updated for several major versions of macOS. Setting out to update all of them to the versions available for macOS 10.13.6, I quickly ran into stumbling blocks. Some packages have changed considerably. In the case of hfs I decided not to pursue updating it now because it would be too time-consuming: most of the old headers simply aren’t present in the newer release and it’s not clear what it does provide now. Configd has also proven to be rather difficult. It uses functions that are missing unless -DPRIVATE or -DKERNEL is provided, but neither of these seems to be intended for third-party builds; the Makefile only provides -D__OPEN_SOURCE__. With both those flags configd starts depending on headers from Neon, a C HTTP/WebDAV client library, which is available on opensource.apple.com, but we can only guess at the version that corresponds to a certain version of macOS.

This means the source releases Apple provides aren’t necessarily even buildable, and I’ll probably hold off on updating configd as a consequence. During this troubleshooting I did find out that the AvailabilityInternal.h header Apple provides in the XNU sources has not been updated since macOS 10.12.6. Systems running later versions of macOS do have an updated header locally, but the open source release is missing several definitions used by other open source releases. I originally tried fixing this by patching the missing definitions into the header, which could be done programmatically, but I quickly realized why Apple changed to a different definition for availability macros. The old approach leads to a combinatorial explosion of definitions for all combinations of macOS versions: the header on my system has 10k more lines than the header in the open source release, and this difference would only get bigger with newer versions. So instead I opted to update the relatively few places where old-style availability macros were used for newer versions to use the newer macros. This made it possible to update Security, which was at the version corresponding to macOS 10.9.5, with a note stating it was to be updated as soon as the problems were figured out. Four major versions of macOS later, I think we did it : )
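
To get a feel for why the old scheme grows so quickly, here is a rough Python sketch. It assumes one macro per (introduced, deprecated) version pair, with names modeled on the __AVAILABILITY_INTERNAL__MAC_x_DEP__MAC_y pattern. The helper names are made up for illustration, and the real header also covers other platforms and point releases, so actual counts are higher.

```python
from itertools import combinations_with_replacement


def minor_releases(last_minor):
    """Hypothetical helper: version tokens 10_0 .. 10_<last_minor>."""
    return [f"10_{minor}" for minor in range(last_minor + 1)]


def old_style_macro_names(versions):
    """One macro name per (introduced, deprecated) pair with introduced <= deprecated.

    Assumption: this only models the shape of the old-style names,
    it is not a copy of Apple's header.
    """
    return [
        f"__AVAILABILITY_INTERNAL__MAC_{intro}_DEP__MAC_{dep}"
        for intro, dep in combinations_with_replacement(versions, 2)
    ]


# The number of pairs grows quadratically with the number of releases.
for last in (9, 12, 15):
    print(last, len(old_style_macro_names(minor_releases(last))))
# prints:
# 9 55
# 12 91
# 15 136
```

With n releases this simplified model needs n(n+1)/2 definitions, whereas an attribute-based macro takes the versions as arguments and needs no per-pair definition.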

September 2021

The last issue for the LLVM bump is that I had patched two packages, wheel and sphinx, because of a hash mismatch. I figured the tarballs had simply changed and required new hashes, but changing the hash caused a similar hash mismatch on Linux. It turns out this problem was due to the file systems involved, which symphorien and VladimirCunat helped me realise. On Linux, Unicode is usually encoded using UTF-8 and normalization form C (NFC). The HFS+ file system, which was the default on macOS until APFS replaced it in High Sierra (10.13), uses UTF-16 and normalization form D (NFD). This means hashing the same file name across these file systems can give different results. Both of these Python packages contain tests using file names with characters that are represented differently in NFC and NFD, so I patched their sources to use normalization-resistant characters. This seems to have worked fine for wheel but not for sphinx, which I’m still looking into. Ideally upstream will be interested in these changes; otherwise we’ll have to carry the patches to guarantee these packages are fixed-output derivations (FODs). A more general solution would be to change the hashing Nix does to normalize all Unicode in a fixed way. I’m not sure this is a path we want to walk though: there are strings which do not round-trip through normal forms, so it’s not 100% guaranteed to work.
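
The effect is easy to reproduce in isolation. The sketch below is only an illustration: it hashes the UTF-8 bytes of a made-up file name with SHA-256 as a stand-in for any byte-level hash (it is not how Nix serializes paths), and it encodes both forms as UTF-8 so that only the normalization differs. The function names are hypothetical.

```python
import hashlib
import unicodedata


def name_digest(name: str) -> str:
    """SHA-256 over the UTF-8 bytes of a file name (stand-in for a byte-level hash)."""
    return hashlib.sha256(name.encode("utf-8")).hexdigest()


def is_normalization_stable(name: str) -> bool:
    """True if the name is byte-identical in both NFC and NFD."""
    return unicodedata.normalize("NFC", name) == name == unicodedata.normalize("NFD", name)


nfc_name = unicodedata.normalize("NFC", "caf\u00e9.txt")  # "é" as a single code point
nfd_name = unicodedata.normalize("NFD", nfc_name)         # "e" followed by a combining accent

# Both names render the same, but the bytes (and therefore the hashes) differ.
print(nfc_name.encode("utf-8"))   # prints b'caf\xc3\xa9.txt'
print(nfd_name.encode("utf-8"))   # prints b'cafe\xcc\x81.txt'
print(name_digest(nfc_name) == name_digest(nfd_name))  # prints False

# Plain ASCII names are unaffected, which is the idea behind
# switching the tests to normalization-resistant characters.
print(is_normalization_stable("cafe.txt"))  # prints True
```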

As for the SDK bump, configd remains a hard nut to crack. Somewhere around macOS 10.13 Apple stopped releasing all the headers that are necessary to build it. XPC no longer seems avoidable as a dependency, CoreFoundation seems incomplete (we rely on Darling for the missing bits now), and there are some packages hosted on opensource.apple.com but not listed in the macOS releases, like neon and OpenBSM. I don’t see an alternative to getting the XPC headers from the SDK, and likewise for some other missing headers. There are several projects online where missing headers are worked around by stubbing, like OSXPrivateSDK and GoVPN, but it doesn’t seem like a good way to deal with the issue: fabricating a constant can lead to unexpected behavior. That’s why I intend to use binaries from the SDK whenever dependencies aren’t available. Unfortunately this will be a step back with regard to building from source.

October 2021

Up to this point I’ve been developing the Apple SDK bump on top of my PR bumping LLVM in the x86_64 Darwin stdenv from 7 to 11, which is undergoing review now. However, NixOS seems to have a policy of not bumping major things, like the LLVM version, between releases within a year (@sterni’s comment on the PR). This means the LLVM 11 bump will have to wait until after the 21.11 release has more or less been finalized. Once that’s done we’ll be able to work on bumping the default LLVM for Linux too. Side Note: LLVM 13 is already on the horizon and Rust 1.56 depends on it. Rust has also become a dependency of Sphinx, through the Python cryptography library. Harmonizing with this version would avoid having to wait for two LLVM builds in Hydra evaluations, which would be great.

After finding out the LLVM bump couldn’t make it into the next release I shifted my focus back to the Apple SDK bump, rebasing it on Nixpkgs without my LLVM bump. The intent was to make it possible to merge before the release, and the next step was to add earlier unpacking of the SDK so we can substitute parts of Apple’s open source releases that do not build. This is where the first hurdle has come up. The Darwin stdenv depends on cmake, which depends on libuv. libuv’s configure phase tries to compile a simple conftest, and ld errors out on an absolute path to CoreFoundation, which seems to be passed in by clang. Apple SDK 10.13 switched to providing only .tbd files, no object files, which causes the linker to fail to find CoreFoundation at the absolute path. Clang 7 does support text-based stubs and libuv uses the recommended approach of using dlopen() to load dynamic libraries, so seemingly this absolute path being passed to ld is all that’s standing in the way of having it build. There’s very little time left before the release and there’s more work left to do getting the source releases to build, so I’m going to focus on building the source releases with an LLVM 11 stdenv.

In other news, RFC 112 is proposing to demote x86_64 Darwin from a Tier 2 to a Tier 3 platform. There is no intent to decrease the support for Darwin, but this would be a more honest reflection of the current status. Darwin CI regularly runs into problems where builders are idle but the queue runner isn’t scheduling jobs for them. The main problem this causes is that some channel updates depend on a set of Darwin jobs passing, so this ends up delaying the channel advance. In order to alleviate the parts of the issues raised that can be addressed by the community, Domen has started a call for Darwin maintainers.

Nix 2.4 was released recently and it comes with installer improvements, especially on Darwin. Unfortunately it does have a hard-to-debug issue on Darwin. I’ll be trying to find a way to reproduce this issue, but extra eyes are needed.

Edit: Add call for Darwin maintainers.

llvm is listed as a release critical package because many downstream packages (e.g. rustc) are very sensitive to how it’s structured and its feature set. So a major change can’t be merged 1 to 5 weeks before branch-off according to the release schedule.

Adding potential regressions and “trying to stabilize for a release” are at odds with each other. The rule was added after several dozen fixes were needed when a [well intended and very needed] refactor was done to llvmPackages right before the 21.05 release.

AFAIK, there’s nothing else blocking an update to llvm other than someone taking the time to make it happen.

November 2021

At the start of the month I spent time on getting the SDK bump to work on LLVM 7. This was not fruitful, and since the only reason for it was making the 21.11 cut-off deadline, I dropped it. Instead I helped out with ZHF and finished up work on the LLVM 11 bump, which has been merged to staging. Stabilization is ongoing on staging-next to shake out any remaining problems, like MariaDB breaking because it was updated.

Domen’s call for Darwin maintainers had a good response. The darwin-maintainers team has grown to almost 40 people, which should help with both response times and the number of eyeballs on any Darwin issues that pop up. Five new Darwin builders were added to OfBorg to help with Darwin maintenance, which is very useful for maintainers without access to a Darwin system. These builders are sponsored by MacStadium.

@pxc asked me to shout out @bergkvist’s work on generating small binary wrappers. Darwin does not allow interpreted scripts to be used as shebang interpreters. Using a binary wrapper for programs that are commonly used as shebang interpreters, like Python, Perl or Ruby, would work around this limitation.

December 2021

The LLVM bump caused a couple regressions on aarch64-darwin because of generic isDarwin guards I removed, but these were rather easily fixed.

With the LLVM bump completed I’ve been able to focus on the SDK bump. After rebasing, libuv, a dependency of the bootstrap-tools, stopped building. I found and fixed the problem during the last week and am now at the point where I can try substituting parts of the SDK distributed by Apple for open source releases that don’t build. This should unblock the SDK bump and allow us to move forward.

January 2022

This month I focused on the SDK. I set out to substitute configd with the SystemConfiguration framework from the SDK. This means the framework and its dependencies need to build from the bootstrap-tools, so I had to add print-reexports to the bootstrap tarball. Then I ran into the familiar issue with XNU no longer providing all the availability macros and I set out to patch the affected parts of the SDK. After making headway through two or three frameworks, I realized how many locations would need to be patched, so I needed a different approach. I’ve opted to patch Availability.h to redefine availability macros in terms of the modern __API_AVAILABLE style of macros, which are backed by the availability attribute provided by modern compilers. This means we give up the ability to compile things on Darwin that make use of the source releases with older compilers. The next step is figuring out why PCRE doesn’t have pthreads available anymore when being built in stage 3.

February 2022

While waiting for builds to test the hybrid open source releases and SDK frameworks approach, a particularly slow phase of the Darwin stdenv build stood out to me. Libsystem is mostly just a collection of headers from other open source releases. Simply copying some header files from other derivations seemed like it shouldn’t take a significant amount of time. Turns out this assumption was wrong. From my rigorous scientific testing, i.e., running the Libsystem build once with the time command (only Libsystem, not including dependencies), using cpio in the Libsystem build made it take ~250 minutes, or 4 hours and 10 minutes. That’s a lot of minutes.

The motivation for using cpio seems to be its -p flag for “pass-through” mode. This mode accepts paths on stdin and copies the files to a target directory, without creating or extracting an archive. What’s special about this is that it can preserve the relative path when doing so: whereas cp A/B C copies B into C (as C/B), echo A/B | cpio -pd C results in the copy being at C/A/B. As I found out, doing this with other tools is not straightforward. It seems like rsync is the only other copying tool with close to drop-in compatible behavior. However, that’s a more complicated program than we want in the bootstrap-tools.

So I had to roll my own shell function based on cp --parents, which has a similar function but preserves the source’s read-only permissions, making it impossible to copy multiple files into the same subdirectory. With the current implementation of CopyHierarchy build times went down to ~75 seconds. This time I re-ran nix-build --check many times because I couldn’t believe the difference. Yes, on my aging machine with a 7200 RPM HDD this is a 200x speedup.
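
To make the path-preserving copy semantics concrete, here is a small Python model of them. This is not the shell function used in Nixpkgs, just a sketch of the behavior it has to provide: the function name, the argument order and the temporary directory layout are all assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path


def copy_hierarchy(src_root: Path, rel_paths, dest: Path) -> None:
    """Copy each src_root/rel_path to dest/rel_path, keeping the relative path."""
    for rel in rel_paths:
        target = dest / rel
        # Parent directories are created fresh instead of being cloned from
        # the source, so they stay writable even if the source is read-only.
        target.parent.mkdir(parents=True, exist_ok=True)
        # copyfile copies contents only, not the source's permission bits.
        shutil.copyfile(src_root / rel, target)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    dest = Path(tmp) / "dest"
    (src / "A").mkdir(parents=True)
    (src / "A" / "B").write_text("b")
    (src / "A" / "D").write_text("d")
    (src / "A").chmod(0o555)  # read-only directory, similar to a store path

    # Two files landing in the same subdirectory is the case that breaks
    # when the read-only mode of A is carried over to the destination.
    copy_hierarchy(src, ["A/B", "A/D"], dest)
    print(sorted(p.relative_to(dest).as_posix() for p in dest.rglob("*")))
    # prints ['A', 'A/B', 'A/D']

    (src / "A").chmod(0o755)  # restore so the temporary directory can be removed
```

As a sanity check on the numbers above: 250 minutes is 15,000 seconds, and 15,000 / 75 = 200.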

Now that Nix 2.4+ has been released and the broad changes to the manual have made their way into the documentation, it has finally become practical to incorporate the uninstallation instructions I wrote up in June. During the discussion of these changes, expanding the installer to include uninstallation capabilities came up again. @abathur’s call for contributions to the installer, in particular adding uninstall capabilities, hasn’t been fulfilled yet, so it’s still outstanding.

March 2022

Since switching to using the SystemConfiguration framework from the SDK as configd, I’ve been getting the following error when building bootstrapped-pip, a dependency of cups, which I’ve been using as a good litmus test for the macOS frameworks:

Sourcing pip-install-hook
Sourcing setuptools-build-hook
Using setuptoolsShellHook
@nix { "action": "setPhase", "phase": "unpackPhase" }
unpacking sources
unpacking source archive /nix/store/3dki8ikais8yg1npyrznzd39vpg66d79-wheel-0.37.1-source
unpacking source archive /nix/store/nh1m8hjdbrcmslac09dwyk589cp4w1fb-pip-21.3.1-source
unpacking source archive /nix/store/yv7chq5668yzjdzgjiy8kf4sbgavkrqj-setuptools-57.2.0-sdist.tar.gz
source root is .
setting SOURCE_DATE_EPOCH to timestamp 1648839464 of file ./.sandbox.sb
warning: file ./.sandbox.sb may be generated; SOURCE_DATE_EPOCH may be non-deterministic
@nix { "action": "setPhase", "phase": "patchPhase" }
patching sources
@nix { "action": "setPhase", "phase": "configurePhase" }
configuring
no configure script, doing nothing
@nix { "action": "setPhase", "phase": "installPhase" }
installing
Building setuptools wheel...
/private/tmp/nix-build-python3.9-bootstrapped-pip-21.3.1.drv-0/setuptools /private/tmp/nix-build-python3.9-bootstrapped-pip-21.3.1.drv-0
Traceback (most recent call last):
  File "/nix/store/fkvh5szkvi5c86936p58gk9v0nhlq5gd-python3-3.9.10/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/nix/store/fkvh5szkvi5c86936p58gk9v0nhlq5gd-python3-3.9.10/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/private/tmp/nix-build-python3.9-bootstrapped-pip-21.3.1.drv-0/pip/src/pip/__main__.py", line 29, in <module>
    from pip._internal.cli.main import main as _main
  File "/private/tmp/nix-build-python3.9-bootstrapped-pip-21.3.1.drv-0/pip/src/pip/_internal/cli/main.py", line 9, in <module>
    from pip._internal.cli.autocompletion import autocomplete
  File "/private/tmp/nix-build-python3.9-bootstrapped-pip-21.3.1.drv-0/pip/src/pip/_internal/cli/autocompletion.py", line 10, in <module>
    from pip._internal.cli.main_parser import create_main_parser
  File "/private/tmp/nix-build-python3.9-bootstrapped-pip-21.3.1.drv-0/pip/src/pip/_internal/cli/main_parser.py", line 8, in <module>
    from pip._internal.cli import cmdoptions
  File "/private/tmp/nix-build-python3.9-bootstrapped-pip-21.3.1.drv-0/pip/src/pip/_internal/cli/cmdoptions.py", line 23, in <module>
    from pip._internal.cli.parser import ConfigOptionParser
  File "/private/tmp/nix-build-python3.9-bootstrapped-pip-21.3.1.drv-0/pip/src/pip/_internal/cli/parser.py", line 12, in <module>
    from pip._internal.configuration import Configuration, ConfigurationError
  File "/private/tmp/nix-build-python3.9-bootstrapped-pip-21.3.1.drv-0/pip/src/pip/_internal/configuration.py", line 20, in <module>
    from pip._internal.exceptions import (
  File "/private/tmp/nix-build-python3.9-bootstrapped-pip-21.3.1.drv-0/pip/src/pip/_internal/exceptions.py", line 8, in <module>
    from pip._vendor.requests.models import Request, Response
  File "/private/tmp/nix-build-python3.9-bootstrapped-pip-21.3.1.drv-0/pip/src/pip/_vendor/requests/__init__.py", line 135, in <module>
    from . import utils
  File "/private/tmp/nix-build-python3.9-bootstrapped-pip-21.3.1.drv-0/pip/src/pip/_vendor/requests/utils.py", line 28, in <module>
    from ._internal_utils import to_native_string
  File "/private/tmp/nix-build-python3.9-bootstrapped-pip-21.3.1.drv-0/pip/src/pip/_vendor/requests/_internal_utils.py", line 11, in <module>
    from .compat import is_py2, builtin_str, str
  File "/private/tmp/nix-build-python3.9-bootstrapped-pip-21.3.1.drv-0/pip/src/pip/_vendor/requests/compat.py", line 63, in <module>
    from urllib.request import parse_http_list, getproxies, proxy_bypass, proxy_bypass_environment, getproxies_environment
  File "/nix/store/fkvh5szkvi5c86936p58gk9v0nhlq5gd-python3-3.9.10/lib/python3.9/urllib/request.py", line 2620, in <module>
    from _scproxy import _get_proxy_settings, _get_proxies
ModuleNotFoundError: No module named '_scproxy'

Our Python build is not configured to create _scproxy, so urllib can’t import it. This is intentional: we don’t pass configd to the CPython expression when building python3Minimal because that wouldn’t be minimal. I’m not sure whether it would be desirable anyway, since it would require Python to depend on the SDK, which would mean not all of the source code that goes into it is available. This used to be patched out of the Python used for bootstrapping like so:

   substituteInPlace Lib/urllib.py --replace "if sys.platform == 'darwin'" "if False"

But that line was removed in this commit. I’m not sure why this problem is surfacing again now since it has been working fine ever since. I’m still looking into this.

I have considered simply dropping the source releases altogether, as was done for aarch64-darwin but this isn’t trivial and would mean dropping most of the open source parts of the Darwin stdenv.

What are the implications of this? Would all packages that require macOS SDK frameworks (or even just the darwin stdenv) become nonfree? It seems like holding on to the open source parts of darwin is increasingly a losing battle.

I am actually unsure of the licensing situation. I believe the SDK is supposedly built from the source releases. And AFAIUI Apple’s license is BSD-like. However, parts of the SDK, like XPC don’t have released sources, though we do need XPC in particular if we go with the source releases too. So I think linking to the SDK is fine. Although some source releases, like network_cmds, aren’t part of it. And some headers might only be in the source releases, I think that may be the cause for the scproxy problems.

As far as I can tell, the command-line tools (not sure about Xcode) make an exception only for the open-source components (and I have no idea how that works when the open-source components are incomplete or unavailable). What about the other stuff though? How does that work for e.g., Metal or CoreImage?

April 2022

Something that stood out with the _scproxy issue was the involvement of a bootstrap stage Python in the bootstrapped-pip build. This raised my suspicion that something was wrong with the Python build. Turns out there were errors preventing _scproxy.c from being compiled but these errors were silently swallowed and did not result in a failing build.
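
One way to catch this kind of silently skipped extension module is an explicit check after the build. The sketch below is a suggestion, not something taken from the Nixpkgs Python expression: the function names are hypothetical, and which modules count as required (for example _scproxy on Darwin) is an assumption the caller has to supply.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_required(names):
    """Abort with a non-zero exit status if any required module is missing."""
    missing = missing_modules(names)
    if missing:
        raise SystemExit(f"required modules were not built: {', '.join(missing)}")


# "json" ships with CPython; the second name is deliberately bogus.
print(missing_modules(["json", "module_that_does_not_exist_xyz"]))
# prints ['module_that_does_not_exist_xyz']
```

Running something like check_required(["_scproxy"]) with the freshly built interpreter would turn the missing module into a build failure instead of a runtime surprise.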

The reason for the errors was an issue with the Availability macros. Apple stopped updating them in the open source releases, probably because of the combinatorial explosion of macros with each new version of macOS. They’ve since switched to newer macros, which are backed by the availability attribute in modern Clang and GCC.

My fix is redefining the old macros using the new-style availability attribute. This finally fixed the Python build and completed the substitution of configd by SystemConfiguration from the SDK. Now I can continue bumping Apple’s open source releases.

May 2022

With configd out of the way I was finally able to focus on bumping the actual Apple open source releases we rely on to construct our own source-based SDK. This went swimmingly. I’ve updated about half of the source releases and tracked down some of the headers we were missing, so our SDK will be more complete. So far I haven’t encountered any major new roadblocks, knock on wood.
