Avoid Linux locking up in low memory situations using earlyoom

I really needed it: my laptop with 8 GB of memory can’t handle nix-env on nixpkgs-unstable anymore :scream_cat: :scream_cat: :scream_cat:

I use this in my build wrapper script which obviously only addresses the nix part, but is helpful nonetheless:

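# Total physical memory in GB, as reported by free
MEM=$(free --giga | gawk '/^Mem:/ {print $2}')
if [ "$MEM" -lt 12 ]; then
  echo "Detected memory: ${MEM}GB. Lowering GC heap size"
  # Start Nix's garbage collector with a small initial heap
  export GC_INITIAL_HEAP_SIZE=1m
fi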
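
On a machine that is known to be short on memory, the same variable could also be set declaratively instead of in a wrapper script. This is only a sketch: it assumes you want the lower initial heap unconditionally (there is no memory check here), and that setting it through `environment.variables` for login sessions is enough for your workflow.

```nix
{
  # Assumption: applied to every session on this machine, with no
  # check of how much RAM is installed.
  environment.variables.GC_INITIAL_HEAP_SIZE = "1m";
}
```

The wrapper script above remains the better fit if the same configuration is shared between machines with different amounts of RAM.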

Thank you! This seems useful. The default initial heap size for the garbage collector is 384 MiB, per Common Environment Variables - Nix Reference Manual.

I understand your command saves up to 383 MB of memory for a nix process? :thinking: Seems useful for my systems with 1 or 2 GB!

I understand your command saves up to 383 MB of memory for a nix process?

I don’t know enough about the specific garbage collector or how the various nix commands do allocations to give you anything tangible in terms of quantifiable memory savings.

My only data point - on an 8GB laptop, it would go into swap hell when running other things and doing nixos-rebuild without this. With this, the laptop would remain usable.

Hi everyone, is this still working well for you?
Do you also disable systemd.oomd or can they live together?

Thanks!

I use earlyoom, it works. It fires well before systemd would oom-kill.

There is services.swapspace, which is tangentially related: it uses your storage as swap when it detects that free memory is below a set threshold, and releases the storage space afterwards.

Another thing I’d encourage looking into is increasing the memory thresholds for the nix daemon systemd service. I was struggling with OOM conditions for a while until I increased the memory thresholds available to the daemon, which made the daemon survive perfectly fine in almost all cases without degrading performance. 12GB laptop, and of course, ymmv. I also use zram for swapping purposes.
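
A rough sketch of what such a setup might look like. The post does not say which daemon settings were raised, so the `MemoryHigh`/`MemoryMax` directives below are a guess at one possible mechanism, and the values are placeholders for a 12 GB machine rather than recommendations.

```nix
{
  # Compressed swap held in RAM.
  zramSwap.enable = true;

  # Swap files on disk, created and removed on demand.
  services.swapspace.enable = true;

  # Assumption: the "memory thresholds" are systemd resource-control
  # limits on the nix-daemon unit. Placeholder values, not from the post.
  systemd.services.nix-daemon.serviceConfig = {
    MemoryHigh = "8G";  # the daemon's cgroup is throttled above this
    MemoryMax = "10G";  # hard limit for the daemon's cgroup
  };
}
```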

Zram reduces available memory and keeps the CPU tied up longer in swapping. If anything, it’d make builds more likely to stall in a different way (not technically OOM, but memory pressure would force quasi-thrashing).

This tool is a blessing. Summary for NixOS from the blog post to prevent Linux freezes from out of memory situations:

{
  services.earlyoom = {
    enable = true;
    freeSwapThreshold = 2;
    freeMemThreshold = 2;
    extraArgs = [
      "-g" "--avoid '^(X|plasma.*|konsole|kwin)$'"
      "--prefer '^(electron|libreoffice|gimp)$'"
    ];
  };
}

Thanks everyone. So in a nutshell, you enable oomd and earlyoom, plus zram in some cases, which means all of them can be used together.

I made a very conservative change and the situation has improved. If I see it deteriorate, I’ll enable earlyoom as well:

{
  zramSwap.enable = true;
  systemd.oomd.enableUserSlices = true;  # take action on user-space process hierarchies
}

I’m getting this error:

systemctl status earlyoom
× earlyoom.service - Early OOM Daemon
     Loaded: loaded (/etc/systemd/system/earlyoom.service; enabled; preset: ignored)
    Drop-In: /nix/store/wb3d1kqsqlfxmq6kyf4ihzv62lj24njk-system-units/earlyoom.service.d
             └─overrides.conf
     Active: failed (Result: exit-code) since Fri 2025-05-16 19:39:25 IST; 4min 17s ago
   Duration: 54ms
 Invocation: b0f68ef5c4c8400b964813281ca728c8
       Docs: man:earlyoom(1)
             https://github.com/rfjakob/earlyoom
    Process: 2006 ExecStart=/nix/store/cpl768r9vzf8zdyg8zh9lziab4ps3y42-earlyoom-1.8.2/bin/earlyoom $EARLYOOM_ARGS (code=exited, status=13)
   Main PID: 2006 (code=exited, status=13)

May 16 19:39:25 leptup systemd[1]: earlyoom.service: Scheduled restart job, restart counter is at 5.
May 16 19:39:25 leptup systemd[1]: earlyoom.service: Start request repeated too quickly.
May 16 19:39:25 leptup systemd[1]: earlyoom.service: Failed with result 'exit-code'.
May 16 19:39:25 leptup systemd[1]: Failed to start Early OOM Daemon.

with this configuration:

  services.earlyoom = {
    enable = true;
    freeSwapThreshold = 2;
    freeMemThreshold = 2;
    extraArgs = [
      "-g"
      "--avoid '^(X|init|Xorg|ssh|gnome.*|ghostty|kwin)$'"
      "--prefer '^(electron|libreoffice|gimp|zen*)$'"
    ];
  };

I use GNOME, btw.

Can you paste the output of

$ journalctl -xu earlyoom

It should look like this:

May 11 21:06:53 pc systemd[1]: Started Early OOM Daemon for Linux.
░░ Subject: A start job for unit earlyoom.service has finished successfully
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit earlyoom.service has finished successfully.
░░
░░ The job identifier is 8940.
May 11 21:06:53 pc earlyoom[82412]: earlyoom 1.8.2
May 11 21:06:53 pc earlyoom[82412]: Preferring to kill process names that match regex '^(hls.*)$'
May 11 21:06:53 pc earlyoom[82412]: Will avoid killing process names that match regex '^(X|xmonad.*|firefox.*)$'
May 11 21:06:53 pc earlyoom[82412]: mem total: 23788 MiB, user mem total: 22214 MiB, swap total: 8191 MiB
May 11 21:06:53 pc earlyoom[82412]: sending SIGTERM when mem avail <=  6.00% and swap free <=  2.00%,
May 11 21:06:53 pc earlyoom[82412]:         SIGKILL when mem avail <=  3.00% and swap free <=  1.00%
May 11 21:06:53 pc earlyoom[82412]: killing whole process group 4428 (-g flag is active)
May 11 21:06:53 pc earlyoom[82412]: mem avail: 15173 of 22213 MiB (68.31%), swap free: 8191 of 8191 MiB (100.00%)

I presume the options passed are incorrect?
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ The unit earlyoom.service has entered the 'failed' state with result 'exit-code'.
lines 1-25...skipping...
May 16 18:00:41 leptup systemd[1]: Started Early OOM Daemon.
░░ Subject: A start job for unit earlyoom.service has finished successfully
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit earlyoom.service has finished successfully.
░░
░░ The job identifier is 3784.
May 16 18:00:41 leptup earlyoom[24316]: earlyoom 1.8.2
May 16 18:00:41 leptup earlyoom[24316]: /nix/store/cpl768r9vzf8zdyg8zh9lziab4ps3y42-earlyoom-1.8.2/bin/earlyoom: unrecognized option '--avoid '^(X|init|Xorg|ssh|gnome.*|ghostty|kwin)$''
May 16 18:00:41 leptup earlyoom[24316]: Try 'earlyoom --help' for more information.
May 16 18:00:41 leptup systemd[1]: earlyoom.service: Main process exited, code=exited, status=13/n/a
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ An ExecStart= process belonging to unit earlyoom.service has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 13.
May 16 18:00:41 leptup systemd[1]: earlyoom.service: Failed with result 'exit-code'.

Also, I have run into memory issues in the past, even though I have 16 GB of RAM and about 17 GB of swap. I tried the system OOM killer, but it would always kill gnome-shell and throw me back to the login screen. After that I’d get so many artifacts and glitches that I had to restart anyway, and my work didn’t get saved either.

I’m not sure what could be wrong here, but you can leave it out, and then add one item at a time.

That looks like you need to write the arguments as separate strings, like "--avoid" "'^(X|init|Xorg|ssh|gnome.*|ghostty|kwin)$'", so Nix doesn’t escape the space and make it all one argument.
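
A minimal sketch of that suggestion applied to the config posted above, untested. It assumes the module escapes each list element into exactly one argument, which is what the `unrecognized option` line in the journal suggests. Under that assumption the inner single quotes are dropped here, since they would likely end up as literal characters inside the regex. It also assumes `zen*` was meant as `zen.*`: as a regex, `zen*` matches `ze` followed by any number of `n`.

```nix
{
  services.earlyoom = {
    enable = true;
    freeSwapThreshold = 2;
    freeMemThreshold = 2;
    extraArgs = [
      "-g"
      # Option and value are separate list elements, so each one
      # becomes its own argument on the earlyoom command line.
      "--avoid"
      "^(X|init|Xorg|ssh|gnome.*|ghostty|kwin)$"
      "--prefer"
      "^(electron|libreoffice|gimp|zen.*)$"
    ];
  };
}
```

After a rebuild, `journalctl -xu earlyoom` should show the "Preferring to kill" and "Will avoid killing" lines with the regexes as intended, which is a quick way to check how the arguments actually arrived.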

Wow, this tool is a game changer.

I ran a test last night: I opened 50+ tabs in Firefox and watched FreeTube videos all night. After a few seconds of performance hiccups, the system settled into consistent performance. Opening new tabs would bog it down, but after a few seconds videos and general computer use were fine.

My config uses zram swap and swapspace. The system I used has an i5-7400 with 12 GB of DDR4. Physical RAM came close to maxing out and virtual memory increased a lot, but the system never crashed.

Now, I don’t know how much services.swapspace contributed in this scenario, as I used them in concert. For the longest time, I’ve only used zram swap.

How long has services.earlyoom been around?

I feel like this should be added to the official NixOS wiki. It feels so essential to have.

For servers, I have found that this combination of settings avoids basically all the usual problems:

boot.kernel.sysctl = {
    # It no longer makes sense to have these be large percentages of
    # all system RAM.
    "vm.admin_reserve_kbytes"                     = 131072;  # 0x20000 = 128M
    # These would ideally be tuned based on the speed at which
    # the system's persistent storage can sink random-offset writes.
    "vm.dirty_background_bytes"                   = 1048576; # 0x100000 = 1M
    "vm.dirty_bytes"                              = 2097152; # 0x200000 = 2M

    # Disable memory overcommit, allow total AS commit no more than
    # swap + 90% of phys RAM
    "vm.overcommit_memory"                        = 2;
    "vm.overcommit_ratio"                         = 90;
};

# oomd is not useful when overcommit is disabled   
systemd.oomd.enable = false;

This requires a swap partition with at least twice as much space as the system’s physical RAM to avoid fork failures. It will not actually get used unless something goes horribly wrong. Depending on your workload you may want to adjust dirty_background_bytes and dirty_bytes upward, but be cautious. System RAM getting filled with dirty data pages faster than the disk can absorb them is, in my experience, the #1 root cause of OOM conditions on servers.
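
To make the sizing concrete, here is a sketch for a hypothetical machine with 16 GiB of RAM. It uses a swap file rather than the swap partition described above, purely because that is easy to declare; the path and sizes are assumptions to adapt.

```nix
{
  # Hypothetical machine with 16 GiB of RAM; adjust to your hardware.
  # With vm.overcommit_memory = 2 the kernel's commit limit is roughly
  #   swap + RAM * vm.overcommit_ratio / 100
  # so 32 GiB swap + 16 GiB * 90 / 100 = 46.4 GiB of committable memory.
  swapDevices = [
    {
      device = "/var/lib/swapfile"; # assumed path; a swap file, not a partition
      size = 32 * 1024;             # in MiB, i.e. twice the 16 GiB of RAM
    }
  ];
}
```

The resulting limit can be compared against the `CommitLimit` and `Committed_AS` lines in /proc/meminfo.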

(If you know a reliable way to measure “the speed at which the system’s persistent storage can sink random-offset writes”, please tell me. All the tools I’ve tried are statistically naive and/or specifically designed for plain old spinning rust with no software RAID or anything on top.)

I think these settings are also appropriate for well-nigh all desktop builds, but I haven’t comprehensively tested them on desktops.

The person from this article indicates the following:

Of course, adding some swap space would help, but I prefer to avoid adding more swap as it’s terribly inefficient and only postpone the problem.

With that said, if one is using services.earlyoom, should one disable zramSwap and/or services.swapspace?

Is earlyoom not working effectively because it never gets triggered, since swapspace dynamically expands virtual memory?

I’m just trying to figure out the best configuration in this scenario.

I also wonder: if I only use earlyoom, would disabling zramSwap and/or services.swapspace also reduce “virtual memory” wear and tear on SSD storage?

Reference: https://dataswamp.org/~solene/2022-09-28-earlyoom.html
