NixOS integration tests with graphical applications: Best practice

Lately, I’ve started to add tests for some graphical applications to nixos/tests in order to reduce manual maintenance and improve reliability. I couldn’t find any documentation or tutorials, so I just read the existing tests (the one for chromium is extensive) and fiddled around. There are some learnings and pitfalls, but I also have quite a few open questions. Maybe you can add your recommendations, or possibly resolve some questions?

The rough idea (I guess)

What I usually do is:

  1. Start the application as an unprivileged user
  2. Do one nontrivial thing (e.g. for an editor: minimally edit a file)
  3. Test whether that action had roughly the expected effect

This will hopefully already catch a number of bugs like “The program segfaults on start”. I’m not sure whether it’s a good idea to extend the test further. The more extensive the test, the more features are tested. However, the test suite gets more brittle (more random CI failures) and is harder to maintain (changes in program behaviour necessitate changes in the test suite). What’s your opinion on the best balance here?

Learnings

  • Some programs start with a splash screen or some kind of startup wizard, since they are launched in a pristine VM and thus have never been run before. Often you can (and for sanity’s sake should) disable that with a command line option, or a custom config file which you plant in the user’s home directory.
  • Instead of clicking a button with the mouse, you can often activate it with send_key("ret").
  • Actions such as clicking often take a moment to have an effect. machine.sleep is a poor way to wait for that effect, because it invites race conditions: whether the waiting time suffices depends on how fast the CI machine is. Instead, block on the effect itself, e.g. with wait_for_text or wait_for_file.
  • For more complex GUI interaction, especially with multiple windows, I guess one should really familiarise oneself with xdotool. (I for my part haven’t done so yet.)
  • During development, put plenty of calls to `machine.screenshot("DebugN")` (where N = 1, 2, … is a running index) in your test to see what state it is in.
  • During development, wrap your calls to succeed in a print statement to see what your test is doing.
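
The three steps and the learnings above can be sketched as one small test function. This is only a sketch under assumptions: `myeditor`, the user `alice`, the file path, the window title and the ctrl-s shortcut are all hypothetical, and the helper `as_user` is mine. Only the `machine` methods (wait_for_x, succeed, wait_for_window, send_chars, send_key, wait_for_file, wait_until_succeeds, screenshot) come from the NixOS test driver.

```python
import shlex


def as_user(user, command, display=":0"):
    """Build a shell command that runs `command` as `user` on an X display."""
    inner = f"DISPLAY={display} {command}"
    return f"su - {shlex.quote(user)} -c {shlex.quote(inner)}"


def smoke_test_editor(machine, user="alice", path="/home/alice/note.txt"):
    """Start a (hypothetical) editor, edit a file, check the result."""
    machine.wait_for_x()

    # 1. Start the application as an unprivileged user, in the background.
    machine.succeed(as_user(user, f"myeditor {shlex.quote(path)} >&2 &"))

    # Block until the window exists instead of sleeping for a fixed time.
    machine.wait_for_window("myeditor")
    machine.screenshot("debug1")

    # 2. Do one nontrivial thing: type some text and save it.
    machine.send_chars("hello")
    machine.send_key("ctrl-s")  # assumed save shortcut of the hypothetical editor

    # 3. Check that the action had roughly the expected effect.
    machine.wait_for_file(path)
    machine.wait_until_succeeds(f"grep -q hello {shlex.quote(path)}")
    machine.screenshot("debug2")
```

Inside a testScript you would call `smoke_test_editor(machine)` with the machine object the driver provides. Note that nothing in it sleeps; every wait blocks on an observable effect.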

Pitfalls & Questions

  • wait_for_text uses OCR to detect whether text has appeared, but this sometimes fails for smaller fonts. What’s a good alternative in these cases?
  • If the test fails, it persists no screenshots, I believe. Likewise, you can’t access the screenshots while the test is running. How do I find out what happened in my failed test (short of commenting out all failures to get at the screenshots)?
  • The development cycle is lengthy: Re-running a test takes on the order of a minute, rather than a few seconds, because it boots a whole new VM. This makes development painful.
  • How do I debug a failed test on CI (ofborg)? In particular, how to debug an aarch64 test when I’m on x86_64? I can’t get at the screenshots or other artifacts. The most I can do is to try and print something, but that won’t tell me in what state the currently open windows are.
  • How do I click on a particular button in a dialogue (short of counting pixels and using xdotool to click)?
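
This doesn't answer the questions above, but the two "during development" tips from the learnings (numbered screenshots and printing what succeed does) can be bundled into a small helper. A minimal sketch; the names `make_debug_helpers`, `run` and `snap` are made up, and only `machine.succeed` and `machine.screenshot` come from the test driver.

```python
import itertools


def make_debug_helpers(machine, prefix="debug"):
    """Return (run, snap): a printing wrapper around succeed and a
    screenshot taker that numbers its files debug1, debug2, ..."""
    counter = itertools.count(1)

    def run(command):
        # Echo the command and its output so both end up in the test log.
        print(f"$ {command}")
        output = machine.succeed(command)
        print(output)
        return output

    def snap():
        # The running index shows the order in which screenshots were taken.
        name = f"{prefix}{next(counter)}"
        print(f"screenshot: {name}")
        machine.screenshot(name)
        return name

    return run, snap
```

In a test script this would be used as `run, snap = make_debug_helpers(machine)`, followed by `run("ls /home/alice")` and `snap()` wherever you want to see the state.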

One feature that would be awesome: A test REPL! I’d really like to start one VM, see the VM in a window, and then get dropped into a Python shell where I can enter new test lines. When I leave the shell, it should print my command history back at me so I can copy-paste it into the test and fix it up.

Is something like this feasible?


Further learning: wait_for_window is more reliable, faster, and less resource-hungry than wait_for_text. I did have trouble using it, though, when the window had been started by a user other than root.
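
One thing that might be worth trying when the window belongs to a non-root user is to poll for it as that user with xdotool. This is an untested sketch: it assumes xdotool is installed in the VM, that the X display is :0, and that `xdotool search` exits non-zero while no window matches. The helper name `wait_for_user_window` is made up; `wait_until_succeeds` is the test driver's method.

```python
import shlex


def wait_for_user_window(machine, user, title_regex, display=":0"):
    """Block until `user` can see a window whose title matches `title_regex`."""
    search = f"DISPLAY={display} xdotool search --name {shlex.quote(title_regex)}"
    command = f"su - {shlex.quote(user)} -c {shlex.quote(search)}"
    # Retries the command until it exits with status 0.
    machine.wait_until_succeeds(command)
    return command
```

Like wait_for_window, this avoids OCR entirely, so it might also help where wait_for_text struggles with small fonts.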


Have you seen the NixOS 23.11 manual (the "Running Tests interactively" section)?


That’s really helpful! I wasn’t aware of that. Is it possible to attach to the X server as well somehow?

Doing nix-build nixos/tests/my_test.nix -A driver && result/bin/nixos-test-driver is already much better than restarting the VM from scratch. You can then use machine.send_key and xdotool to interact with the VM live.

Workaround for exporting the history: Press F3, copy-paste, and clean up the paste in a decent text editor.
Workaround for attaching to X: machine.screenshot("tmp"), and keeping an image viewer on the generated file tmp.png.


Recently, you can do:

$ nix-build . -A nixosTests.mytest.driverInteractive
$ ./result/bin/nixos-test-driver --interactive

See the NixOS 23.11 manual, which was updated in the meantime.

This is really helpful because it starts a QEMU window displaying the VM.


Just to be absolutely clear here, in case anyone else overlooks it like I did when I first tried this a while back:

If you define users in your test machine(s) and do what @turion showed above, you can log in interactively and try out/develop your test commands in the VM until you find something that works, then port that to the testScript afterwards.
