Hi, I have been using NixOS for about a month and am still a beginner.
I tested openai-whisper-cpp 1.5.2 (from both unstable and 23.11), and it works great on the CPU.
I want to test it with CUDA GPU for speed.
I use a modified nix file: I added libcublas and cudatoolkit from cudaPackages to buildInputs, cuda_nvcc to nativeBuildInputs, and env = { WHISPER_CUBLAS = "1"; }.
I install it with ( callPackage ./path/to/modified/openai-whisper-cpp.nix { } ) in environment.systemPackages.
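For anyone who wants to reproduce this setup, here is a minimal sketch of the same changes written as an overrideAttrs call instead of a copied package file. It is an assumption about what the modified file amounts to, not the exact file used here. The name whisperCuda is made up for illustration, and the CUDA packages need allowUnfree.

```nix
# configuration.nix fragment (sketch; not the exact modified file from this post)
{ pkgs, ... }:
let
  # Hypothetical name for the CUDA-enabled variant.
  whisperCuda = pkgs.openai-whisper-cpp.overrideAttrs (old: {
    # nvcc is a build-time tool, so it goes in nativeBuildInputs.
    nativeBuildInputs = (old.nativeBuildInputs or [ ]) ++ [
      pkgs.cudaPackages.cuda_nvcc
    ];
    # Libraries the binary links against, as listed in the post.
    buildInputs = (old.buildInputs or [ ]) ++ [
      pkgs.cudaPackages.libcublas
      pkgs.cudaPackages.cudatoolkit
    ];
    # Passed to the build as an environment variable, as described in the post.
    env = (old.env or { }) // { WHISPER_CUBLAS = "1"; };
  });
in
{
  nixpkgs.config.allowUnfree = true; # CUDA packages are unfree
  environment.systemPackages = [ whisperCuda ];
}
```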
It builds fine, and I confirmed that it runs on the CUDA GPU (I watched the GPU usage with nvidia-smi -l 1).
However, a bug in 1.5.2 makes it generate no output with CUDA, so I switched to the latest version, 1.5.4.
But no matter what I try, I end up with this error when running nixos-rebuild switch:
/nix/store/HHHHAAAASSSSHHHH-binutils-2.40/bin/ld: cannot find -lcuda: No such file or directory
I also looked at the CUDA-related code in llama-cpp and copied almost all of it into openai-whisper-cpp, but the build fails with exactly the same error.
The problem only shows up in 1.5.4, not in 1.5.2.
I compared the Makefiles of both versions, and only 1.5.4 has -lcuda in LDFLAGS.
I am not a programmer so I do not know what to do next to fix it.
Thanks for your reply!
But this workaround is for openai-whisper only.
I played with the nix file of openai-whisper-cpp for a few more hours but still cannot get it to build with CUDA.
I will give openai-whisper a try, since it seems to be the easiest way to use CUDA.
I am testing openai-whisper now.
When I use the one-liner override, I get this error:
> Found duplicated packages in closure for dependency 'triton':
> triton 2.0.0 (/nix/store/xsynvsl5jf2ql6d5djroq4cvrjzmh9z8-python3.11-triton-2.0.0/lib/python3.11/site-packages/triton-2.0.0.dist-info)
> triton 2.0.0 (/nix/store/0fk4tidjg50wj5ppjl3fv1npjj80fl0b-python3.11-triton-2.0.0/lib/python3.11/site-packages/triton-2.0.0.dist-info)
>
> Package duplicates found in closure, see above. Usually this happens if two packages depend on different version of the same dependency.
After some testing, I finally got openai-whisper-cpp to work with CUDA.
I ran rg -- -lcuda in the nixpkgs repo and found a similar package, openai-triton, which uses substituteInPlace to substitute some paths.
I also found this post useful.
On my machine, the problem is solved by replacing "-lcuda " with "-lcuda -L${pkgs.linuxPackages.nvidia_x11}/lib " in the Makefile using substituteInPlace; after that it builds with no error. I will post an overlay here later.
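The substitution described above could be written as an overlay roughly like this. It is only a sketch under a few assumptions: the package is already at 1.5.4 with the CUDA inputs added, its Makefile contains the literal string "-lcuda " (with the trailing space), and the driver in use is the default linuxPackages.nvidia_x11. Linking against the driver package this way ties the build to that one driver revision.

```nix
# overlay.nix (sketch of the workaround, not a recommended final form)
final: prev: {
  openai-whisper-cpp = prev.openai-whisper-cpp.overrideAttrs (old: {
    # Append to any existing postPatch instead of overwriting it.
    postPatch = (old.postPatch or "") + ''
      # Add the driver's lib directory to the linker search path
      # right after the -lcuda flag in the Makefile.
      substituteInPlace Makefile \
        --replace "-lcuda " "-lcuda -L${final.linuxPackages.nvidia_x11}/lib "
    '';
  });
}
```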
Hi! Great progress! Note that nvidia_x11 isn’t meant to be linked to directly, because its libcuda.so is only compatible with the .ko from the exact same nvidia_x11 revision.
At build time one may link ${cudaPackages.cuda_cudart.lib}/lib/stubs/libcuda.so, which is the fake driver library. Use the addDriverRunpath hook to extend the final binary’s DT_RUNPATHs in postFixup. At runtime it’ll use /run/opengl-driver/lib/libcuda.so on NixOS (by means of addDriverRunpath) and LD_LIBRARY_PATH on generic FHS distributions.
Also avoid mixing cudatoolkit with the normal cuda libraries (cuda_cudart, libcublas, etc). If you get an error when you remove cudatoolkit, it means there’s another component missing.
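Put together, the advice above might look roughly like the overlay below. Treat it as a sketch with stated assumptions, not a tested recipe: it assumes the package is already at 1.5.4, that the Makefile contains the literal string "-lcuda ", that the main binary is installed as $out/bin/whisper-cpp, and that the nixpkgs revision provides the addDriverRunpath hook (older releases ship it under the name addOpenGLRunpath). More CUDA components may be needed once cudatoolkit is removed.

```nix
# overlay.nix (sketch: link against the stub libcuda, resolve the real one at runtime)
final: prev: {
  openai-whisper-cpp = prev.openai-whisper-cpp.overrideAttrs (old: {
    nativeBuildInputs = (old.nativeBuildInputs or [ ]) ++ [
      final.cudaPackages.cuda_nvcc
      final.addDriverRunpath # provides the addDriverRunpath shell function
    ];
    # Individual CUDA libraries only; no cudatoolkit mixed in.
    buildInputs = (old.buildInputs or [ ]) ++ [
      final.cudaPackages.cuda_cudart
      final.cudaPackages.libcublas
    ];
    env = (old.env or { }) // { WHISPER_CUBLAS = "1"; };
    postPatch = (old.postPatch or "") + ''
      # -L takes a directory: point the linker at the directory
      # that holds the stub libcuda.so, not at the file itself.
      substituteInPlace Makefile \
        --replace "-lcuda " "-lcuda -L${final.cudaPackages.cuda_cudart.lib}/lib/stubs "
    '';
    postFixup = (old.postFixup or "") + ''
      # Add /run/opengl-driver/lib to the binary's runpath so the
      # real driver library is found at runtime on NixOS.
      # Assumed install path; adjust to the actual binary names.
      addDriverRunpath $out/bin/whisper-cpp
    '';
  });
}
```

The stub is only there to satisfy the linker. The driver library that actually gets loaded comes from the running system, which is why the binary does not depend on a specific nvidia_x11 revision.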
Thanks for your advice!
Now I am changing it to -L${cudaPackages.cuda_cudart.lib}/lib/stubs/libcuda.so
However, I cannot get past this error:
./main: error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory
I think I have to somehow append .1 to the filename of libcuda.so, but I don’t know how to do that.
I tried ln -s ${cudaPackages.cuda_cudart.lib}/lib/stubs/libcuda.so ${cudaPackages.cuda_cudart.lib}/lib/stubs/libcuda.so.1, but no luck.
I already tried that and got the same error message.
I used find /nix/store -maxdepth 1 -type d -iname '*cudart*' to find the folder, and /nix/store/somehash-cuda_cudart-11.8.89-lib/lib/stubs/ contains only libcuda.so, not libcuda.so.1. I think that is the cause of the error.