Understanding rebuilds: jarring "screenshot failed" error on `nixos-rebuild switch --upgrade`

In short: I don’t think anything suspicious happened, but I’m hoping for a 101-style guide or write-up on how packaging executes and how to read rebuild logs. For example: how much of a nixos-rebuild run is tests, and how can I tell (while rebuilding) which test is running and for which package? (And obviously: why did this failure happen?)


When I say it was “jarring” I mean: it caught my attention: “is some malicious code running a screenshot?” I now think the answer’s no. Here’s a snippet of the rebuild output error on output-line 1632 (apparently starting further up at line 1343; for clarity: raw log), inlined:

$ nixos-rebuild switch --upgrade
# ... logs snipped for brevity ...
>           raise ScreenShotError(msg)
E           mss.exception.ScreenShotError: Unable to open display: b':99'.

__class__  = <class 'mss.linux.MSS'>
display    = b':99'
kwargs     = {'display': ':99'}
msg        = "Unable to open display: b':99'."
# ... logs snipped for brevity ...
Adding rules for package /nix/store/f1hysiasn2hnj6gk6dy9dhyvmgzy1djm-libinput-1.27.0
Copying /nix/store/f1hysiasn2hnj6gk6dy9dhyvmgzy1djm-libinput-1.27.0/lib/udev/rules.d/80-libinput-device-groups.rules to /nix/store/bmxrc2yr4irwglwv6yv8yc2l29k6ffky-udev-rules/80-libinput-device-groups.rules
# ... logs snipped for brevity ...
Checking that all programs called by absolute paths in udev rules exist... OK
self       = <mss.linux.MSS object at 0x7ffff5c012a0>

src/mss/linux.py:319: ScreenShotError
=========================== short test summary info ============================
FAILED src/tests/test_find_monitors.py::test_keys_monitor_1 - mss.exception.ScreenShotError: Unable to open display: b':99'.
============ 1 failed, 58 passed, 10 skipped, 7 deselected in 4.92s ============
building '/nix/store/bvghfh9rz1if6dmxmyzyp5920v0mikql-X-Restart-Triggers-systemd-udevd.drv'...
$ [[ $? -eq 0 ]]
1

This appears to be an error from the python-mss package, and nix-tree shows I’m depending on python3.12-mss-10.0.0 indirectly because I directly install yubioath-flutter-helper-7.1.1. Now I’m wondering: is this an actual packaging bug I should try to help fix? How do folks more easily find the nix packaging code responsible for a given line of output they see? (Example: it took me quite a while to figure out (99% sure) that it’s this perfectly-legit-looking checkPhase block that is running the tests.)
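A minimal sketch of how I could have pulled the failing test out of a saved log, assuming the output had been captured to a file first. The sample lines are copied from the snippet above; the temporary file and the function name failed_tests are made up for illustration.

```shell
# Sample lines copied from the rebuild output above, written to a
# temporary file that stands in for a saved log (hypothetical).
log=$(mktemp)
cat > "$log" <<'EOF'
src/mss/linux.py:319: ScreenShotError
=========================== short test summary info ============================
FAILED src/tests/test_find_monitors.py::test_keys_monitor_1 - mss.exception.ScreenShotError: Unable to open display: b':99'.
============ 1 failed, 58 passed, 10 skipped, 7 deselected in 4.92s ============
building '/nix/store/bvghfh9rz1if6dmxmyzyp5920v0mikql-X-Restart-Triggers-systemd-udevd.drv'...
EOF

# Print the test id from each pytest "FAILED <test id> - <reason>" summary line.
failed_tests() {
  sed -n 's/^FAILED \([^ ]*\) - .*/\1/p' "$1"
}

failed_tests "$log"
```

The mss.exception prefix in the failure reason names the Python module, which lines up with the python3.12-mss-10.0.0 that nix-tree showed. From there, something like `nix edit nixpkgs#python312Packages.mss` should open the packaging expression, though that attribute name is my guess and the command needs the experimental nix-command and flakes features enabled.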

More questions: interestingly, when I ran the build command again in another terminal (I was trying to reproduce), I hit no issues. So then what’s the functional purpose of the test failing if another rebuild has no issue?

Okay partially answering some of my own questions…

101 “how do things work” questions…

tl;dr lots of phases and every discrete package has at least one chance to run a test suite

… from the nixpkgs manual on the “phases” of stdenv.mkDerivation:

This generic command either invokes a script at buildCommandPath, or a buildCommand, or a number of phases. Package builds are split into phases to make it easier to override specific parts of the build (e.g., unpacking the sources or installing the binaries).

and then just a handful of paragraphs down, the order in which phases run by default:

$prePhases unpackPhase patchPhase
$preConfigurePhases configurePhase $preBuildPhases buildPhase checkPhase
$preInstallPhases installPhase fixupPhase installCheckPhase
$preDistPhases distPhase $postPhases

So that’s helpful: there’s not only checkPhase but also installCheckPhase.

… And then a few more sections down, an explanation of the checkPhase:

The check phase checks whether the package was built correctly by running its test suite. The default checkPhase calls make $checkTarget, but only if the doCheck variable is enabled.

It is highly recommended, for packages’ sources that are not distributed with any tests, to at least use versionCheckHook to test that the resulting executable is basically functional.
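To check my own understanding of the excerpts above, here is a toy model of the phase loop. It is not the real stdenv code: the phase names and order come from the manual quote, the $pre*/$post* hook lists and distPhase are left out, and failIn is a made-up variable that simulates a failing test suite. The doInstallCheck gate is my assumption that installCheckPhase is switched on the same way checkPhase is.

```shell
# Toy model of the default phase order; NOT the real stdenv setup script.
run_phases() {
  for phase in unpackPhase patchPhase configurePhase buildPhase checkPhase \
               installPhase fixupPhase installCheckPhase; do
    # Per the manual, checkPhase only runs when doCheck is enabled.
    if [ "$phase" = checkPhase ] && [ -z "${doCheck:-}" ]; then continue; fi
    # Assumption: installCheckPhase is gated the same way by doInstallCheck.
    if [ "$phase" = installCheckPhase ] && [ -z "${doInstallCheck:-}" ]; then continue; fi
    echo "running $phase"
    # failIn is hypothetical: it stands in for a test suite that fails.
    if [ "$phase" = "${failIn:-}" ]; then
      echo "$phase failed" >&2
      return 1
    fi
  done
}

# Checks disabled: checkPhase and installCheckPhase are skipped.
run_phases

# Checks enabled and the test suite fails: installPhase is never reached.
(doCheck=1; failIn=checkPhase; run_phases) || echo "build failed"
```

In this model a failing checkPhase stops the loop before installPhase, which would fit with the mss failure blocking everything that depends on it.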


rebuild-correctness question…

I’m guessing this just means the test is flaky/non-deterministic.

Or: there’s a bug in phase-implementation that allowed the build to stay around. That is: my second attempt had almost zero output, which I’m guessing means everything was already installed and built, and so no phases triggered… but then that makes me think the phases that failed the first time around didn’t actually trigger any rollback of the states they were changing.

Would love some input here, since this feels a little too naive of a bug for others to not have hit yet.


logs-clarity questions…

This hope of better understanding logs led me to this 7-year-old “interactive rebuild” post/feature-request, since there are just too many interleaved logs to make sense of things sometimes. So I guess interactivity isn’t a solution.

But even more interesting is this reply about rollbacks:

Which makes me think my “too naive” thought above might actually be right: failed rebuilds really don’t rollback by default? :thinking:

nix-output-monitor is a good way to investigate this.

Wow, nom! I just gave it a try and it’s quite a big help. Thanks for the tip!

edit: Ha neat: since nom is operating on stdin anyways (in the case of nixos-rebuild that seems to be my only option), I was able to just pass those original logs into nom and see how it parses things. This instant output below would’ve saved a lot of head-scratching and time:

┏━ Dependency Graph showing 6 of 15 roots:
┃       ┌─ ↓ ⏸ flutter-sdk-flutter_test waiting for 3 ↓ ⏸
┃       ├─ ↓ ⏸ flutter-sdk-sky_engine waiting for 1 ↓ ⏸
┃       ├─ ↓ ⏸ flutter-sdk-flutter_driver waiting for 1 ↓ ⏸
┃       ├─ ↓ ⏸ flutter-sdk-integration_test waiting for 1 ↓ ⏸
┃       ├─ ↓ ⏸ flutter-sdk-flutter waiting for 1 ↓ ⏸
┃       ├─ ↓ ⏸ flutter-sdk-fuchsia_remote_debug_protocol waiting for 1 ↓ ⏸
┃       ├─ ↓ ⏸ flutter-sdk-flutter_localizations waiting for 1 ↓ ⏸
┃       │  ┌─ ↓ ⏸ flutter-wrapped-3.24.4-sdk-links waiting for 2 ↓ ⏸
┃       ├─ ↓ ⏸ flutter-sdk-flutter_web_plugins
┃    ┌─ ↓ ⏸ yubioath-flutter-7.1.1-package-config.json
┃ ┌─ ⏸ yubioath-flutter-7.1.1-package-config-with-root.json
┃ │     ┌─ ↓ ⏸ flutter_immutable waiting for 1 ↓ ⏸
┃ │  ┌─ ↓ ⏸ flutter-wrapped-3.24.4
┃ │  ├─ ⏸ flutter-cache-dir
┃ ├─ ⏸ flutter-wrapped-3.24.4-sdk-links
┃ │  ┌─ ⚠ python3.12-mss-10.0.0 failed with exit code 1 after ⏱ 0s
┃ ├─ ⏸ yubioath-flutter-helper-7.1.1
┃ ⏸ yubioath-flutter-7.1.1
┃ ✔ gdm-autologin.pam
┃ ✔ user-environment
┃ ✔ uhk-udev-rules-4.1.0
┃ ┌─ ✔ desktops
┃ ├─ ✔ unit-script-display-manager-start
┃ ├─ ✔ xserver.conf
┃ ✔ unit-display-manager.service
┃ ┌─ ✔ nixos-help
┃ ✔ nixos-help
┣━━━ Builds             │ Downloads          │ Host
┃    ⏵ 12 │ ✔ 32 │      │     │       │      │ localhost
┃         │      │      │     │ ↓ 359 │      │ https://cache.nixos.org
┗━ ∑ ⏵ 12 │ ✔ 32 │ ⏸ 21 │ ↓ 0 │ ↓ 359 │ ⏸ 12 │ ⚠ Exited after 1 build failures at 22:36:04 after 3s

(it highlights python3.12-mss-10.0.0 in red and shows it clearly as a child of yubioath).
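For next time, a small sketch of how I might keep the raw log around while still getting a readable view. The helper names capture and view are made up, the log lives in a temporary file, and the fallback to cat when nom is not installed is my own addition. It relies only on nom reading build output from stdin, as described above.

```shell
# Where the raw output is kept for later replay (temporary file for illustration).
log=$(mktemp)

# Pipe stdin through nom when it is installed, otherwise pass it through unchanged.
view() {
  if command -v nom >/dev/null 2>&1; then nom; else cat; fi
}

# Run a command, save its combined stdout/stderr to $log, and show it via view.
capture() {
  "$@" 2>&1 | tee "$log" | view
}

# Intended use (not run here):
#   capture nixos-rebuild switch --upgrade
# Replaying the saved log afterwards:
#   view < "$log"
```

Saving the raw output first means the same log can be re-read, grepped, or fed through nom again after the fact.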