I've been having a strange issue since upgrading a server from 21.05. Initially I was getting read-only filesystem errors with the logs, but I found some docs that mentioned needing to use a new ReadWritePaths option in my config as of 20.09, so I fixed that, which brings me to my current issue.
First, I’ve made no serious changes to my config besides the aforementioned ReadWritePaths addition. This setup has been working without issue for the past couple of years, and through at least one upgrade.
Second, the issue only occurs when I start nginx through systemd. If I start nginx manually then all is well and good; that is to say, I know there’s nothing wrong with the nginx.conf that ultimately gets generated by nix on switching.
Here are the relevant details from logs, etc.:
- No errors are output when I start the service as usual
$ systemctl start nginx.service
- After starting the service, running status tells me the workers are exiting
$ systemctl status nginx.service
● nginx.service - Nginx Web Server
     Loaded: loaded (/nix/store/cyy4wdgm32v67gpgh6gnhxx8197v15h8-unit-nginx.service/nginx.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2021-06-05 21:38:36 UTC; 1min 56s ago
    Process: 395995 ExecStartPre=/nix/store/50difldgpz2h0qc4g4p8mbwkgy5ihzib-unit-script-nginx-pre-start/bin/nginx-pre-start (code=exited, status=0/SUCCESS)
   Main PID: 395997 (nginx)
         IP: 0B in, 0B out
         IO: 0B read, 0B written
      Tasks: 3 (limit: 2373)
     Memory: 52.6M
        CPU: 25.248s
     CGroup: /system.slice/nginx.service
             ├─395997 nginx: master process /nix/store/z0rqfwaw46hl0snzaiw8wzr1sxbkjqiw-nginx-1.21.0/bin/nginx -c /nix/store/jfd309xx69g4hs6x4p8fznkjh3lfjy2q-nginx.conf
             ├─398832 nginx: master process /nix/store/z0rqfwaw46hl0snzaiw8wzr1sxbkjqiw-nginx-1.21.0/bin/nginx -c /nix/store/jfd309xx69g4hs6x4p8fznkjh3lfjy2q-nginx.conf
             └─398833 nginx: master process /nix/store/z0rqfwaw46hl0snzaiw8wzr1sxbkjqiw-nginx-1.21.0/bin/nginx -c /nix/store/jfd309xx69g4hs6x4p8fznkjh3lfjy2q-nginx.conf

Jun 05 21:40:30 hostname systemd-coredump: Process 398793 (nginx) of user 0 dumped core.
Jun 05 21:40:30 hostname nginx: 2021/06/05 21:40:30 [alert] 395997#395997: worker process 398793 exited on signal 31 (core dumped)
Jun 05 21:40:31 hostname systemd-coredump: Process 398804 (nginx) of user 0 dumped core.
Jun 05 21:40:31 hostname nginx: 2021/06/05 21:40:31 [alert] 395997#395997: worker process 398804 exited on signal 31 (core dumped)
Jun 05 21:40:31 hostname systemd-coredump: Process 398808 (nginx) of user 0 dumped core.
Jun 05 21:40:31 hostname nginx: 2021/06/05 21:40:31 [alert] 395997#395997: worker process 398808 exited on signal 31 (core dumped)
Jun 05 21:40:32 hostname systemd-coredump: Process 398823 (nginx) of user 0 dumped core.
Jun 05 21:40:32 hostname systemd-coredump: Process 398816 (nginx) of user 0 dumped core.
Jun 05 21:40:32 hostname nginx: 2021/06/05 21:40:32 [alert] 395997#395997: worker process 398823 exited on signal 31 (core dumped)
Jun 05 21:40:32 hostname nginx: 2021/06/05 21:40:32 [alert] 395997#395997: worker process 398816 exited on signal 31 (core dumped)
- Here’s what nginx.service looks like (generated by nixos):
[Unit]
After=network.target
Description=Nginx Web Server
StartLimitIntervalSec=60

[Service]
Environment="LOCALE_ARCHIVE=/nix/store/in621vh2kj0ayqa6qc9pqnjvx6hzj5h5-glibc-locales-2.32-46/lib/locale/locale-archive"
Environment="PATH=/nix/store/a4v1akahda85rl9gfphb07zzw79z8pb1-coreutils-8.32/bin:/nix/store/1hvm45djn8wkfg64gbmlqpfj4dnjh594-findutils-4.7.0/bin:/nix/store/7n3yzh9wza4bdqc04v01xddnfhkrwk2a-gnugrep-3.6/bin:/nix/store/g34ldykl1cal5b9ir3xinnq70m52fcnq-gnused-4.8/bin:/nix/store/r2bw74x7zci7shzxq3cikww9kp1wxc6i-systemd-247.6/bin:/nix/store/a4v1akahda85rl9gfphb07zzw79z8pb1-coreutils-8.32/sbin:/nix/store/1hvm45djn8wkfg64gbmlqpfj4dnjh594-findutils-4.7.0/sbin:/nix/store/7n3yzh9wza4bdqc04v01xddnfhkrwk2a-gnugrep-3.6/sbin:/nix/store/g34ldykl1cal5b9ir3xinnq70m52fcnq-gnused-4.8/sbin:/nix/store/r2bw74x7zci7shzxq3cikww9kp1wxc6i-systemd-247.6/sbin"
Environment="TZDIR=/nix/store/y4j4k0l6w941wriprxz13dhvz896lw3m-tzdata-2020f/share/zoneinfo"
X-StopIfChanged=false
AmbientCapabilities=CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_SYS_RESOURCE
CacheDirectory=nginx
CacheDirectoryMode=0750
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
CapabilityBoundingSet=CAP_SYS_RESOURCE
ExecReload=/nix/store/z0rqfwaw46hl0snzaiw8wzr1sxbkjqiw-nginx-1.21.0/bin/nginx -c '/nix/store/jfd309xx69g4hs6x4p8fznkjh3lfjy2q-nginx.conf' -t
ExecReload=/nix/store/a4v1akahda85rl9gfphb07zzw79z8pb1-coreutils-8.32/bin/kill -HUP $MAINPID
ExecStart=/nix/store/z0rqfwaw46hl0snzaiw8wzr1sxbkjqiw-nginx-1.21.0/bin/nginx -c '/nix/store/jfd309xx69g4hs6x4p8fznkjh3lfjy2q-nginx.conf'
ExecStartPre=/nix/store/50difldgpz2h0qc4g4p8mbwkgy5ihzib-unit-script-nginx-pre-start/bin/nginx-pre-start
Group=root
LockPersonality=true
LogsDirectory=nginx
LogsDirectoryMode=0750
MemoryDenyWriteExecute=true
NoNewPrivileges=true
PrivateDevices=true
PrivateMounts=true
PrivateTmp=true
ProcSubset=pid
ProtectClock=true
ProtectControlGroups=true
ProtectHome=true
ProtectHostname=true
ProtectKernelLogs=true
ProtectKernelModules=true
ProtectKernelTunables=true
ProtectProc=invisible
ProtectSystem=strict
ReadWritePaths=/var/www/
ReadWritePaths=/run/
RemoveIPC=true
Restart=always
RestartSec=10s
RestrictAddressFamilies=AF_UNIX
RestrictAddressFamilies=AF_INET
RestrictAddressFamilies=AF_INET6
RestrictNamespaces=true
RestrictRealtime=true
RestrictSUIDSGID=true
RuntimeDirectory=nginx
RuntimeDirectoryMode=0750
SystemCallArchitectures=native
SystemCallFilter=~@cpu-emulation @debug @keyring @ipc @mount @obsolete @privileged @setuid
UMask=0027
User=root
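Two details in the output above can be pulled out mechanically: the signal number the workers die with, and the system call groups the unit denies. The sketch below does both, using one log line and the SystemCallFilter= line copied from this post; the variable names are illustrative only. On x86-64 Linux, signal 31 is SIGSYS, which is the signal the kernel delivers when a seccomp filter rejects a system call, and systemd implements SystemCallFilter= with seccomp. That may make the filter line worth a closer look, though it is an inference rather than something the logs state outright.

```shell
# One of the alert lines from the status output above.
line='Jun 05 21:40:30 hostname nginx: 2021/06/05 21:40:30 [alert] 395997#395997: worker process 398793 exited on signal 31 (core dumped)'

# Extract the number that follows "exited on signal".
signum=$(printf '%s\n' "$line" | sed -n 's/.*exited on signal \([0-9][0-9]*\).*/\1/p')

# Ask the shell for the signal's name. The number-to-name mapping is
# platform-dependent, so run this on the affected machine.
signame=$(kill -l "$signum")
echo "signal $signum is SIG$signame"

# The deny-list from the generated unit above. On the live system this
# line could instead be read from: systemctl cat nginx.service
filter_line='SystemCallFilter=~@cpu-emulation @debug @keyring @ipc @mount @obsolete @privileged @setuid'

# Strip the key and the leading "~" (which marks a deny-list), one group per line.
blocked_groups=$(printf '%s\n' "$filter_line" | sed -n 's/^SystemCallFilter=~//p' | tr ' ' '\n')
printf '%s\n' "$blocked_groups"

# To see which system calls a group contains, systemd ships:
#   systemd-analyze syscall-filter @privileged
```

If the signal does turn out to be SIGSYS, `coredumpctl info` on one of the dumped worker PIDs may help narrow down what the worker was doing when it was killed.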
I don’t really know much about systemd, so I’m a little confused and lost at the moment. Like I mentioned before, nginx runs without any issues if I stop the service, then take the ExecStart command from the above service file and run it directly.
$ systemctl stop nginx.service
$ /nix/store/z0rqfwaw46hl0snzaiw8wzr1sxbkjqiw-nginx-1.21.0/bin/nginx -c '/nix/store/jfd309xx69g4hs6x4p8fznkjh3lfjy2q-nginx.conf'
At this point, I can access the sites that are served on this machine from any browser.
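One difference between the two ways of starting nginx is that a process launched by hand from a root shell is not subject to the sandboxing directives in the unit file. A quick way to see this for the system call filter is the Seccomp field in /proc/<pid>/status. The helper below is a minimal sketch; the function name seccomp_mode is made up for illustration, and the PIDs in the usage comments will differ per run.

```shell
# Print the seccomp mode recorded in a /proc/<pid>/status-style file.
# Per proc(5): 0 = disabled, 1 = strict, 2 = filter.
seccomp_mode() {
    sed -n 's/^Seccomp:[[:space:]]*//p' "$1"
}

# Hypothetical usage on the affected machine:
#   while the unit is running:
#     seccomp_mode "/proc/$(systemctl show -p MainPID --value nginx.service)/status"
#   after stopping the unit and starting nginx by hand:
#     seccomp_mode "/proc/$(pgrep -o nginx)/status"
```

If the first call prints 2 and the second prints 0, the manually started nginx is running without the unit's system call filter, which would be consistent with it working only outside systemd.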
Other details that may or may not be important:
- The server has 2GB of RAM, with 1.4GB available. This hasn’t changed since I spun it up 2 years ago, and has never caused an issue.
- I’ve rebooted the server a couple of times since running nixos-rebuild switch --upgrade, but there was no difference in behavior.
- I noticed an issue with fail2ban when I initially upgraded (it failed to start during the upgrade), but since a reboot it has been running without issue.
- No other errors encountered during the upgrade process.
- Server config is fairly basic, no nixops involved. I really only use nginx to proxy to Docker containers and sign certs.
- /var/www is the path used to store pretty much all the certs, logs, and html files.
- /run/ I only added for the default pidfile, but I don’t think it’s actually needed. I get similar results if this is removed from the ReadWritePaths option in my config.
If anyone has any insight or thoughts, it’d be much appreciated. Otherwise, if I happen to make any headway I’ll update the thread with any relevant info.