Cannot get openai-whisper-cpp to work with CUDA

phaeseeKe5Ee · January 8, 2024, 12:19pm

Hi, I have been using NixOS for about a month, so I am still a beginner.
I tested openai-whisper-cpp 1.5.2 (from unstable and 23.11), and it works great on the CPU.
I want to test it with a CUDA GPU for speed.
I use a modified nix file in which I added libcublas and cudatoolkit (from cudaPackages) to buildInputs, cuda_nvcc to nativeBuildInputs, and env = { WHISPER_CUBLAS = "1"; }.
I install it with ( callPackage ./path/to/modified/openai-whisper-cpp.nix { } ) in environment.systemPackages.
It builds fine, and I confirmed that it runs on the CUDA GPU (I used nvidia-smi -l 1 to watch GPU usage).
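
For readers following along, the callPackage wiring described above might look roughly like this in configuration.nix. This is a minimal sketch, not the poster's actual file: the path is the placeholder from the post, and the surrounding module structure is an assumption.

```nix
# configuration.nix (fragment). Sketch only; the path is the placeholder
# used in the post and must point at your own modified package file.
{ pkgs, ... }:
{
  environment.systemPackages = [
    # callPackage fills in the function arguments of the package file
    # (stdenv, fetchFromGitHub, cudaPackages, ...) from pkgs.
    (pkgs.callPackage ./path/to/modified/openai-whisper-cpp.nix { })
  ];
}
```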

However, a bug in 1.5.2 makes it produce no output with CUDA, so I switched to the latest version, 1.5.4.
But no matter what I try, nixos-rebuild switch fails with this error:

/nix/store/HHHHAAAASSSSHHHH-binutils-2.40/bin/ld: cannot find -lcuda: No such file or directory

I also copied almost all of the CUDA-related code from llama-cpp into openai-whisper-cpp, but the build fails with exactly the same error.
The problem only appears in 1.5.4, not in 1.5.2.
I compared the Makefiles of both versions, and only 1.5.4 has -lcuda in LDFLAGS.

I am not a programmer, so I do not know what to do next to fix it.

Solene · January 8, 2024, 12:59pm

I got this problem in the past

phaeseeKe5Ee · January 9, 2024, 12:38am

Thanks for your reply!
But this workaround is for openai-whisper only.
I played with the nix file of openai-whisper-cpp for a few more hours but still cannot get it to build with CUDA.
I will give openai-whisper a try, since it seems to be the easiest way to use CUDA.

phaeseeKe5Ee · January 9, 2024, 1:07am

I am testing openai-whisper now.
When I use the one-liner override, I get this error:

 > Found duplicated packages in closure for dependency 'triton':
       >  triton 2.0.0 (/nix/store/xsynvsl5jf2ql6d5djroq4cvrjzmh9z8-python3.11-triton-2.0.0/lib/python3.11/site-packages/triton-2.0.0.dist-info)
       >         triton 2.0.0 (/nix/store/0fk4tidjg50wj5ppjl3fv1npjj80fl0b-python3.11-triton-2.0.0/lib/python3.11/site-packages/triton-2.0.0.dist-info)
       >
       > Package duplicates found in closure, see above. Usually this happens if two packages depend on different version of the same dependency.

phaeseeKe5Ee · January 9, 2024, 9:59am

After some testing, I finally got openai-whisper-cpp to work with CUDA.
I ran rg -- -lcuda in the nixpkgs repo and found a similar package, openai-triton, which uses substituteInPlace to substitute some paths.
I also found this post useful.

On my machine, the problem is solved by replacing "-lcuda " with "-lcuda -L${pkgs.linuxPackages.nvidia_x11}/lib " in the Makefile using the substituteInPlace command. After that it builds with no errors. I will post an overlay here later.

phaeseeKe5Ee · January 9, 2024, 11:11am

The overlay for openai-whisper-cpp 1.5.4 on 23.11, working with CUDA:

(final: prev:{
	openai-whisper-cpp = prev.openai-whisper-cpp.overrideAttrs  (o: rec {
		version = "1.5.4";
		src = prev.fetchFromGitHub {
			owner = "ggerganov";
			repo = "whisper.cpp";
			rev = "refs/tags/v${version}" ;
			hash = "sha256-9H2Mlua5zx2WNXbz2C5foxIteuBgeCNALdq5bWyhQCk=";
		};
		patches = [ ./path/to/the/download-models-1.5.4.patch ];
		env = o.env // { WHISPER_CUBLAS = "1"; };
		nativeBuildInputs = (o.nativeBuildInputs or [] ) ++ ( with pkgs.cudaPackages; [
			cuda_nvcc
		]);
		buildInputs = (o.buildInputs or [] ) ++ ( with pkgs.cudaPackages; [
			cudatoolkit
			libcublas
			cuda_cudart
		]);
		postPatch = let
			oldStr = "-lcuda ";
			newStr = "-lcuda -L${pkgs.linuxPackages.nvidia_x11}/lib ";
		in ''
			substituteInPlace Makefile \
			--replace '${oldStr}' '${newStr}'
		'';
	});
})

The patch file download-models-1.5.4.patch:

--- a/models/download-ggml-model.sh
+++ b/models/download-ggml-model.sh
@@ -9,18 +9,6 @@
 src="https://huggingface.co/ggerganov/whisper.cpp"
 pfx="resolve/main/ggml"
 
-# get the path of this script
-function get_script_path() {
-    if [ -x "$(command -v realpath)" ]; then
-        echo "$(dirname "$(realpath "$0")")"
-    else
-        local ret="$(cd -- "$(dirname "$0")" >/dev/null 2>&1 ; pwd -P)"
-        echo "$ret"
-    fi
-}
-
-models_path="${2:-$(get_script_path)}"
-
 # Whisper models
 models=(
     "tiny.en"
@@ -82,8 +70,6 @@
 
 printf "Downloading ggml model $model from '$src' ...\n"
 
-cd "$models_path"
-
 if [ -f "ggml-$model.bin" ]; then
     printf "Model $model already exists. Skipping download.\n"
     exit 0
@@ -105,7 +91,7 @@
     exit 1
 fi
 
-printf "Done! Model '$model' saved in '$models_path/ggml-$model.bin'\n"
+printf "Done! Model '$model' saved in 'ggml-$model.bin'\n"
 printf "You can now use it like this:\n\n"
-printf "  $ ./main -m $models_path/ggml-$model.bin -f samples/jfk.wav\n"
+printf "  $ whisper-cpp -m ggml-$model.bin -f samples/jfk.wav\n"
 printf "\n"

If I have time, I will make a PR to enable a one-liner override like ( openai-whisper-cpp.override { cudaSupport = true; } ).

SergeK · January 9, 2024, 4:32pm

Hi! Great progress! Note that nvidia_x11 isn’t meant to be linked to directly, because its libcuda.so is only compatible with the .ko from the exact same nvidia_x11 revision.

At build time one may link against ${cudaPackages.cuda_cudart.lib}/lib/stubs/libcuda.so, which is the fake (stub) driver library. Use the addDriverRunpath hook in postFixup to extend the final binary’s DT_RUNPATH. At runtime it’ll use /run/opengl-driver/lib/libcuda.so on NixOS (by means of addDriverRunpath) and LD_LIBRARY_PATH on generic FHS distributions.

Also avoid mixing cudatoolkit with the normal CUDA libraries (cuda_cudart, libcublas, etc.). If you get an error when you remove cudatoolkit, it means there’s another component missing.

(yes, yes, the documentation needs updating)

Thx
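
The advice in this post could be sketched as an overlay along the following lines. This is an untested sketch under several assumptions: it reuses the attribute names from the overlay posted earlier, omits the version bump and the download-models patch for brevity, assumes the package installs its binaries under $out/bin, and assumes a nixpkgs revision that provides the addDriverRunpath hook (older revisions call it addOpenGLRunpath). Other CUDA components may still be needed once cudatoolkit is removed.

```nix
# Sketch only, not a tested build. Paths and attribute names are taken
# from the posts above and may differ between nixpkgs revisions.
final: prev: {
  openai-whisper-cpp = prev.openai-whisper-cpp.overrideAttrs (old: {
    env = (old.env or { }) // { WHISPER_CUBLAS = "1"; };

    nativeBuildInputs = (old.nativeBuildInputs or [ ]) ++ [
      final.cudaPackages.cuda_nvcc
      # Setup hook that provides the addDriverRunpath shell function.
      final.addDriverRunpath
    ];

    # Split CUDA libraries only; cudatoolkit is left out on purpose.
    buildInputs = (old.buildInputs or [ ]) ++ (with final.cudaPackages; [
      cuda_cudart
      libcublas
    ]);

    # Link against the stub libcuda.so. Note that -L takes a directory.
    postPatch = (old.postPatch or "") + ''
      substituteInPlace Makefile \
        --replace '-lcuda ' '-lcuda -L${final.cudaPackages.cuda_cudart.lib}/lib/stubs '
    '';

    # Add the driver directory (/run/opengl-driver/lib on NixOS) to the
    # runpath so the real libcuda.so is found at run time.
    postFixup = (old.postFixup or "") + ''
      addDriverRunpath $out/bin/*
    '';
  });
}
```

The stub library only satisfies the linker at build time. The real driver library is resolved at run time through the runpath entry, which keeps the package independent of any particular nvidia_x11 revision.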

phaeseeKe5Ee · January 11, 2024, 5:21pm

Thanks for your advice!
I am now changing it to -L${cudaPackages.cuda_cudart.lib}/lib/stubs/libcuda.so

However, I cannot solve this error:

./main: error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory

I think I have to somehow append .1 to the filename of libcuda.so, but I don’t know how to do that.
I tried ln -s ${cudaPackages.cuda_cudart.lib}/lib/stubs/libcuda.so ${cudaPackages.cuda_cudart.lib}/lib/stubs/libcuda.so.1 but had no luck.

SergeK · January 11, 2024, 7:48pm

Oh, that was a typo: -L takes a directory, so drop the /libcuda.so suffix.

phaeseeKe5Ee · January 12, 2024, 8:34am

I already tried that and got the same error message.
I used find /nix/store -maxdepth 1 -type d -iname '*cudart*' to find the folder, and /nix/store/somehash-cuda_cudart-11.8.89-lib/lib/stubs/ contains only libcuda.so, not libcuda.so.1. I think that is the cause of the error.

SergeK · January 12, 2024, 9:20am

Sounds like an old nixpkgs revision. There should be a libcuda.so.1 symlink on master.

phaeseeKe5Ee · January 12, 2024, 4:09pm

YES! I had misconfigured some nixpkgs-unstable related variables in flake.nix and configuration.nix. Now it builds with no complaints!

gaidenko · February 7, 2025, 8:46am

Thanks, this worked really well!