Could someone please share their configuration to get llama.cpp to offload layers to the GPU (amdgpu/ROCm)?
This should be enough in your NixOS configuration module:
{
  services.ollama = {
    enable = true;
    acceleration = "rocm";
  };
}
Or as a package override:
{ pkgs ? import <nixpkgs> {} }:

pkgs.ollama.override {
  acceleration = "rocm";
}
Note that it might take a while to build ollama with acceleration enabled.
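If your GPU isn't on ROCm's officially supported list, you may also need to override the GFX version. If I'm remembering the module option right, that would look roughly like this (double-check the option name on search.nixos.org):

{
  services.ollama = {
    enable = true;
    acceleration = "rocm";
    # assumption: this sets HSA_OVERRIDE_GFX_VERSION for the service; the value
    # depends on your card (e.g. "10.3.0" works for many RDNA2 consumer GPUs)
    rocmOverrideGfx = "10.3.0";
  };
}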
This is for ollama, which is already working fine with ROCm for me; I'm looking to make llama.cpp work with ROCm instead.
That would be the following override:
{ pkgs ? import <nixpkgs> {} }:

pkgs.llama-cpp.override {
  rocmSupport = true;
}
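A related approach (untested sketch on my side) is enabling ROCm globally in the nixpkgs config, so any package that respects the flag picks it up:

{
  # assumption: llama-cpp defaults its rocmSupport argument to
  # config.rocmSupport, so this global flag has the same effect as the override
  nixpkgs.config.rocmSupport = true;
}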
So I tried the override with the stable branch, but it did not work. What did work was getting the packages/dependencies from the unstable-small branch instead. Is this the same experience for you?
let
  unstableSmall = import <nixosUnstableSmall> { config = { allowUnfree = true; }; };
in
{
  services.llama-cpp = {
    enable = true;
    package = unstableSmall.llama-cpp.override { rocmSupport = true; };
    model = "/var/lib/llama-cpp/models/qwen2.5-coder-32b-instruct-q4_0.gguf";
    host = "";
    port = "";
    extraFlags = [
      "-ngl"
      "64"
    ];
    openFirewall = true;
  };
}
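For reference, <nixosUnstableSmall> is just the name I gave the nixos-unstable-small channel. A minimal sketch of the same thing without adding an extra channel, assuming the usual channels.nixos.org tarball URL, would be:

let
  # sketch: pull nixos-unstable-small via fetchTarball instead of a channel
  # (unpinned, so what you get can drift between evaluations)
  unstableSmall = import (fetchTarball
    "https://channels.nixos.org/nixos-unstable-small/nixexprs.tar.xz")
    { config = { allowUnfree = true; }; };
in
{
  services.llama-cpp.package =
    unstableSmall.llama-cpp.override { rocmSupport = true; };
}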
I didn't try to build the package myself. I found the llama-cpp package on search.nixos.org, looked at the package sources, and noticed there is a rocmSupport package argument, hence the override.
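If you want to test the ROCm build interactively before wiring it into the service module, a minimal shell.nix sketch (filename is just an example) could look like this; swap in the unstable-small import from above if the stable build fails for you:

{ pkgs ? import <nixpkgs> { } }:

# puts the ROCm-enabled llama-cpp into a dev shell so you can try the
# llama.cpp binaries by hand before committing to services.llama-cpp
pkgs.mkShell {
  packages = [
    (pkgs.llama-cpp.override { rocmSupport = true; })
  ];
}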