Benchmarking: why does NixOS show higher average CPU usage?

I’ve been benchmarking NixOS against Arch Linux (and also Debian and Clear Linux), doing day-to-day desktop tasks such as watching YouTube (firefox-wayland, firefox-bin and firefox), playing movies with mpv, using Wine, and others. I have a minimalist install: sway (Wayland), Firefox, and not much else. Compared to, say, Arch Linux (with similar settings), I found that the 1-minute load average and the CPU usage are, on average, 30-50% higher on NixOS. The findings are similar across my 5 PCs (laptops and desktops, Intel and AMD CPUs).

I’m wondering why. Is it because of all the symlinking that takes place behind the scenes in order for programs to run? I read that this was causing a problem for KDE (but again, I’m on a minimal setup with sway):
kde slow

Is it the kernel? Both Arch and NixOS use the same CPU governor setting (schedutil).
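
To take the governor and the kernel version out of the comparison, both can be pinned in the NixOS configuration. A minimal sketch, assuming a standard `configuration.nix`; `linuxPackages_latest` is only an illustrative choice for tracking a recent kernel the way a rolling distro does, not something established in this thread:

```nix
{ pkgs, ... }:
{
  # Pin the CPU frequency governor so it matches the Arch install.
  powerManagement.cpuFreqGovernor = "schedutil";

  # Assumption: follow the newest stable kernel packaged in nixpkgs,
  # to keep the kernel version close to what Arch ships.
  boot.kernelPackages = pkgs.linuxPackages_latest;
}
```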

Could it be the compilation flags of the binaries? I looked at the CFLAGS used by Arch Linux, and they are nothing fancy (-O2, generic CPU).


I had the same impression the last time I tried NixOS: the CPUs were busier and the system was consuming more power. I exported the kernel configuration options for both NixOS and Fedora on the same computer to see the differences, but there were just too many and I gave up.


Just making sure: did you take a look at Accelerated Video Playback - NixOS Wiki?


As a longtime Arch user who has been on NixOS for several years now, I can’t say I have experienced this. The only increase in resource utilization I have noticed is storage (because of /nix/store), but even then, after a fresh garbage collection the two are pretty close to equivalent.

I highly doubt that symlinks cause much overhead. I can’t find it now, but I read an article about how symlink resolution is so heavily optimized in the Linux kernel that it is basically a zero-cost abstraction.

Perhaps there is something suboptimal about your configuration that hasn’t been pinpointed yet. My first guess would be graphics acceleration, as @cmm mentioned. You could also try the applicable profiles from nixos-hardware to see if that helps.
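
For the graphics acceleration guess, here is a minimal sketch of what enabling VA-API might look like in `configuration.nix`. It assumes an Intel GPU and the option and package names used by NixOS releases from around the time of this thread (newer releases renamed `hardware.opengl` to `hardware.graphics`). The package list is illustrative, so check the wiki page for your own hardware:

```nix
{ pkgs, ... }:
{
  hardware.opengl = {
    enable = true;
    extraPackages = with pkgs; [
      intel-media-driver # VA-API driver for newer Intel GPUs (assumed hardware)
      vaapiVdpau         # VDPAU-based backend for VA-API
      libvdpau-va-gl     # VDPAU driver on top of VA-API/OpenGL
    ];
  };
}
```

After a rebuild, running `vainfo` (from the `libva-utils` package) should list the decode profiles the driver exposes; if it reports none, video decoding is probably still happening on the CPU.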


For reproducibility, we also disable a lot of architecture-specific optimizations. If you’re running a number crunching workload, this may impact performance a lot.

For Firefox, maybe make sure that hardware acceleration is on?

“I’m wondering why. Is it because of all the symlinking that takes place behind the scenes in order for programs to run?”

That should only occur once, during program startup, unless the program is constantly calling lstat on something.


Sorry, I have a couple of questions, just curious :slightly_smiling_face:

  1. Does this benchmark compare NixOS against another distro with Nix installed, or NixOS against another distro without Nix?
  2. Does the power consumption problem also happen on another distro with Nix installed?

Thanks for all your answers.

@cmm I had video hardware acceleration disabled on both Arch and NixOS. It is enabled now.
But since the difference was not limited to graphical tasks, it’s plausible that jonringer’s comment about disabled optimizations and nrdxp’s mention of the many differences in kernel configs could explain why.

@paklie I benchmarked NixOS against Arch/Debian/Clear without Nix. As for 2, I didn’t test that case.

I have wondered if it might be useful to have a branch or something that enables all these optimizations but that doesn’t get built by Hydra (to avoid quadrupling its workload). Users who want their system fully optimized for their architecture could then take a Gentoo-like approach of building everything from scratch.

Sounds annoying, but if you pushed everything to your own cache it might not be so bad.


That would be interesting.
As an aside, am I wrong in saying that NixOS would require more rebuilds than, say, Gentoo, because every time a library changes, everything depending on it needs rebuilding in NixOS, but not necessarily in Gentoo?

At least until this proposal is integrated:
https://discourse.nixos.org/t/content-addressed-nix-call-for-testers/12881

Yes and no. It depends on exactly what you change. Nixpkgs currently has a special staging branch for package changes that cause a massive rebuild of packages. Once content-addressed derivations are complete, this will help a lot, as you say.

I haven’t really bothered with compiler optimizations though because I really haven’t noticed any major differences, as I mentioned previously.

Some packages pass specific CPU extensions to their builds. I demonstrated how to do this generically in Tensorflow slower as NixOS native than inside a Docker container - #4 by jonringer. Essentially, this sets the following flags on targetPlatform and hostPlatform:

    sse3Support    = featureSupport "sse3";
    ssse3Support   = featureSupport "ssse3";
    sse4_1Support  = featureSupport "sse4_1";
    sse4_2Support  = featureSupport "sse4_2";
    sse4_aSupport  = featureSupport "sse4a";
    avxSupport     = featureSupport "avx";
    avx2Support    = featureSupport "avx2";
    avx512Support  = featureSupport "avx512";
    aesSupport     = featureSupport "aes";
    fmaSupport     = featureSupport "fma";
    fma4Support    = featureSupport "fma4";

However, this will likely cause a given machine to do a LOT of rebuilds. Also, not all packages are “aware” of these flags, and for those packages nothing tells the build to enable the extensions.
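
A minimal sketch of what opting in could look like, assuming a Skylake-class CPU (the `skylake` value is only a placeholder) and the `nixpkgs.localSystem` option. It may differ from the snippet in the linked post, and option names vary between releases:

```nix
{ ... }:
{
  # Assumption: "skylake" is an example; use the -march value for your CPU.
  nixpkgs.localSystem = {
    system = "x86_64-linux";
    gcc.arch = "skylake";
    gcc.tune = "skylake";
  };

  # Builds for a specific gcc.arch require a matching "gccarch-*" system
  # feature on the builder, so it is listed explicitly next to the usual
  # defaults. On older releases this option is nix.systemFeatures.
  nix.settings.system-features = [
    "gccarch-skylake"
    "big-parallel"
    "kvm"
    "nixos-test"
    "benchmark"
  ];
}
```

The feature flags listed above should then be derived from the chosen `gcc.arch`. The builder may need to advertise the new system feature before it accepts these builds, so applying the `system-features` change first and the `localSystem` change in a second rebuild is the safer order.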

Back to the original question: some of these extensions may help a lot with video decoding, so their absence may explain some of the increased CPU usage.

Another idea is to make videoDriver builds a little “impure”, since installation is already impure. Having the drivers compiled with as many optimizations as possible would probably help with compute usage.
