I get this weird error when trying to run:
$ nvidia-offload mangohud glxgears
[2025-02-19 03:38:09.554] [MANGOHUD] [error] [loader_nvml.cpp:42] Failed to open 64bit libnvidia-ml.so.1: libnvidia-ml.so.1: cannot open shared object file: No such file or directory
Segmentation fault (core dumped)
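As a quick diagnostic, here is a minimal sketch of how one could check whether the dynamic loader can find `libnvidia-ml.so.1` at all, independent of MangoHud. The helper name `can_load` is made up for illustration. It assumes Python 3 is available and uses only the standard `ctypes` module.

```python
import ctypes


def can_load(soname):
    """Try to open a shared library by name through the dynamic loader.

    Returns (True, None) on success, or (False, error_text) if the loader
    cannot open it. `can_load` is a hypothetical helper, not part of any tool.
    """
    try:
        ctypes.CDLL(soname)
        return True, None
    except OSError as exc:
        return False, str(exc)


if __name__ == "__main__":
    # Library name taken from the MangoHud error message above.
    for name in ("libnvidia-ml.so.1",):
        ok, err = can_load(name)
        print(name, "OK" if ok else f"FAILED: {err}")
```

Saved under a hypothetical name such as `check_libs.py`, it could be run both as `python3 check_libs.py` and as `nvidia-offload python3 check_libs.py` to see whether the result differs between the two environments.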
Upon checking journalctl, I notice the following error:
$ journalctl --since "1 min ago"
Feb 19 02:49:18 nixos kernel: glxgears[4744]: segfault at 1ec67c28 ip 000000001ec67c28 sp 00007ffcefc18268 err>
Feb 19 02:49:18 nixos kernel: Code: a8 41 00 40 2c 3f 00 60 0b 3f 00 00 3b 3f 00 80 0d 3f ba cb 03 00 ab 28 96>
Feb 19 02:49:18 nixos systemd-coredump[4778]: Process 4744 (glxgears) of user 1000 terminated abnormally with >
Feb 19 02:49:18 nixos systemd[1]: Started Process Core Dump (PID 4778/UID 0).
Feb 19 02:49:18 nixos systemd-timesyncd[688]: Contacted time server [2606:4700:f1::1]:123 (2.nixos.pool.ntp.or>
Feb 19 02:49:18 nixos systemd-coredump[4779]: [🡕] Process 4744 (glxgears) of user 1000 dumped core.
Module libstdc++.so.6 without build-id.
Module libffi.so.8 without build-id.
Module libgcc_s.so.1 without build-id.
Module libfmt.so.10 without build-id.
Module libspdlog.so.1.15 without build-id.
Module libxkbcommon.so.0 without build-id.
Module libMangoHud_opengl.so without build-id.
Module libxcb-sync.so.1 without build-id.
Module libxcb-present.so.0 without build-id.
Module libX11-xcb.so.1 without build-id.
Module libxcb-dri3.so.0 without build-id.
Edit #1
I also noticed that this causes issues with anything that uses those libraries, such as the Python package torch, for example when running ‘stable-diffusion-webui’:
$ nvidia-offload ./webui.sh
RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check
# trying out the --skip-torch-cuda-test option
$ nvidia-offload ./webui.sh --skip-torch-cuda-test
OSError: libstdc++.so.6: cannot open shared object file: No such file or directory
(Edit #1: I am hiding this section because I solved this part. I deleted the venv folder and rebuilt shell.nix, and that fixed the issue. I had made it before upgrading my system with sudo nixos-rebuild --flake .)
I am not sure what could be causing this issue, and so far I have not found anything relevant to it. Can someone help me out with this?
Thank you~