NixOS running out of memory

jokob-sk · January 16, 2025, 9:29pm

Hi there!

This has happened a few times now, and I can't recover from it without hard-rebooting the server. Any advice on how to prevent this, or at least recover from it remotely?

[766253.179498] CIFS: VFS: \\192.168.1.82\Plex Close cancelled mid failed rc:-9
[766253.182240] CIFS: VFS: \\192.168.1.82\Plex Close cancelled mid failed rc:-9
[766253.183155] CIFS: VFS: \\192.168.1.82\Plex Close cancelled mid failed rc:-9
[767570.539969] CIFS: VFS: \\192.168.1.82\Plex Close interrupted close
[767570.540656] CIFS: VFS: \\192.168.1.82\Plex Close unmatched open for MID:32685191
[767570.541628] CIFS: VFS: \\192.168.1.82\Plex Close unmatched open for MID:32685193
[767570.542984] CIFS: VFS: \\192.168.1.82\Plex Close unmatched open for MID:32685194
[767570.544587] CIFS: VFS: \\192.168.1.82\Plex Close unmatched open for MID:32685195
[767570.545118] CIFS: VFS: \\192.168.1.82\Plex Close unmatched open for MID:3=n
[767570.546012] CIFS: VFS: \\192.168.1.82\Plex Close unmatched open for MID:32685198
[767570.547509] CIFS: VFS: \\192.168.1.82\Plex Close cancelled mid failed rc:
[801834.727462] CIFS: VFS: \\192.168.1.82\Plex Close interrupted close
[801834.727548] CIFS: VFS: \\192.168.1.82\Plex Close interrupted close
[801834.727859] CIFS: VFS: \\192.168.1.82\Plex Close unmatched open for MID:34384574
[801834.728819] CIFS: VFS: \\192.168.1.82\Plex Close unmatched open for MID:34384578
[801834.729883] CIFS: VFS: \\192.168.1.82\Plex Close cancelled mid failed rc:-9
[801834.731746] CIFS: VFS: \\192.168.1.82\Plex Close cancelled mid failed rc:-9
[958554.360809] Out of memory: Killed process 2047486 (Isolated Web Co) total-vm:3100332kB, anon-rss:352168kB, file-rss:484kB, shmem-rss:1152kB, UID:1000 pgtables:2292kB oom_score_adj:167

fstab, in case it's relevant:

# This is a generated file.  Do not edit!
#
# To make changes, edit the fileSystems and swapDevices NixOS options
# in your /etc/nixos/configuration.nix file.
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>

# Filesystems.
/dev/disk/by-uuid/f9b85e7b-2d2b-4787-a6bc-6d7a70d2c197 / ext4 x-initrd.mount 0 1
/dev/disk/by-uuid/50A5-A97D /boot vfat fmask=0077,dmask=0077 0 2
//192.168.1.82/Backup /mnt/backup cifs username=nuc,password=<some pwd>,noperm 0 0
//192.168.1.82/docker_appdata /mnt/docker_appdata cifs username=nuc,password=<some pwd>,noperm 0 0
//192.168.1.82/Plex /mnt/plex cifs username=nuc,password=<some pwd>,noperm 0 0
//192.168.1.82/Syncthing /mnt/syncthing cifs username=nuc,password=<some pwd>,noperm 0 0


# Swap devices.
/dev/disk/by-uuid/ea79395b-4b93-413c-bd5a-5d3d83008ae2 none swap defaults

htop (normal operation)

Unfortunately I couldn’t SSH into the system to get stats before the reboot.

Thanks for any pointers in advance,
j

Tmplt · January 16, 2025, 9:39pm

A swapfile can help, and perhaps the zram kernel module, but filling 50G of memory when normal usage is that low points to a larger issue. Perhaps there is a slow but steady memory leak, and the CIFS errors are a symptom rather than the cause?

jokob-sk · January 16, 2025, 9:47pm

Thanks a lot @Tmplt for the reply.

My thoughts exactly: 50GB of memory should give plenty of headroom. Should I write a cron script to log memory usage every few minutes, so I have more data after the next crash? Or do you have other tips on how to monitor this?
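
A minimal sketch of such a logger, in case it is useful: it reads /proc/meminfo and appends one tab-separated line per run, so there is some history to look at after the next freeze. The script name, the log path argument, and the choice of fields are assumptions for illustration, not anything established in this thread. Slab and SUnreclaim are included on the guess that a kernel-side leak (for example in the CIFS client) would show up there rather than in any process.

```python
#!/usr/bin/env python3
"""Append one memory-usage snapshot to a log file (hypothetical memlog.py)."""
import sys
import time
from pathlib import Path

# Fields to record; all are standard /proc/meminfo entries, values in kB.
KEYS = ["MemTotal", "MemAvailable", "SwapTotal", "SwapFree", "Slab", "SUnreclaim"]


def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


def format_snapshot(fields, now):
    """Build one tab-separated line: unix time, then each field in KEYS (-1 if missing)."""
    values = [str(fields.get(key, -1)) for key in KEYS]
    return "\t".join([str(int(now))] + values)


def append_snapshot(log_path):
    """Read the current meminfo and append one snapshot line to log_path."""
    fields = parse_meminfo(Path("/proc/meminfo").read_text())
    with log_path.open("a") as log:
        log.write(format_snapshot(fields, time.time()) + "\n")


if __name__ == "__main__":
    if len(sys.argv) == 2:
        append_snapshot(Path(sys.argv[1]))
    else:
        print("usage: memlog.py LOGFILE", file=sys.stderr)
```

It could be run every few minutes from cron or a systemd timer with a log file path as its only argument. If MemAvailable keeps falling while no process grows, the Slab columns are the first place I would look.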

Looking at zramswap I don’t think this would help in the long term, as it would only delay the OOM issue if there is a memory leak somewhere.

Appreciate the help.

Atemu · January 17, 2025, 12:26am

How frequently and how suddenly does this occur?

If it doesn’t happen too often or fast, I’d just monitor memory usage by spot-checking and seeing whether it has grown and by how much.
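
For the "by how much" part, a rough sketch: given two or more spot checks of MemAvailable (from /proc/meminfo) with timestamps, this estimates the average shrink rate and, assuming the trend stays linear, how long until it would reach zero. The function names are made up for illustration, and the linear extrapolation is only an assumption; a real leak may well be bursty.

```python
def shrink_rate_kb_per_hour(samples):
    """Average decrease of MemAvailable in kB per hour.

    samples: list of (unix_time_seconds, mem_available_kb), oldest first.
    A positive result means available memory is shrinking.
    """
    (t0, a0), (t1, a1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples with increasing timestamps")
    return (a0 - a1) * 3600 / (t1 - t0)


def hours_until_exhausted(samples):
    """Hours until MemAvailable would hit zero at the current rate, or None if not shrinking."""
    rate = shrink_rate_kb_per_hour(samples)
    if rate <= 0:
        return None
    return samples[-1][1] / rate
```

As an example, two checks two hours apart showing 40,000,000 kB and then 39,000,000 kB available give 500,000 kB per hour, or about 78 hours of headroom left.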

The killed process in your log ("Isolated Web Co") is Firefox. What's that doing on a server?

Web pages can very well cause an explosion in RAM usage. I’ve seen badly behaved ones taking gigabytes per tab before.
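
To put numbers on the OOM line from the first post, here is a small sketch that pulls the interesting fields out of a kernel "Out of memory: Killed process" message. The regex is written against the single line quoted in this thread, so treat it as an assumption that other kernels format the message the same way; the helper name is hypothetical.

```python
import re

# Matches the start of a kernel OOM-kill line as quoted in this thread.
OOM_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+Out of memory: Killed process (?P<pid>\d+) "
    r"\((?P<comm>[^)]*)\) total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def parse_oom_line(line):
    """Return a dict of fields from an OOM-kill log line, or None if it does not match."""
    m = OOM_RE.search(line)
    if m is None:
        return None
    return {
        "uptime_days": float(m["ts"]) / 86400,  # dmesg timestamps are seconds since boot
        "pid": int(m["pid"]),
        "comm": m["comm"],
        "total_vm_mib": int(m["total_vm"]) / 1024,
        "anon_rss_mib": int(m["anon_rss"]) / 1024,
    }
```

For the line in the first post this gives roughly 11 days of uptime and about 344 MiB of anonymous memory for the killed process. That is well under 1% of 50GB, so my reading (and it is only a reading) is that this particular Firefox process was the victim the OOM killer picked, not necessarily what filled the memory.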

jokob-sk · January 17, 2025, 11:18pm

Thanks for the reply.

I set up Beszel on it for some basic monitoring and configured usage alerts.

It's a mixed-use server. The freezes happened even when I hadn't logged into a session or opened Firefox, but let's see if I can pinpoint the issue further.