@cgmb
Last active April 17, 2024 23:27
How to build llama.cpp on Debian
#!/bin/sh
# Build llama.cpp on Debian 13 or Ubuntu 23.10 and later
# Tested with `docker run -it --device=/dev/dri --device=/dev/kfd --security-opt seccomp=unconfined --volume $HOME:/mnt/home debian:sid`
apt -y update
apt -y upgrade
apt -y install git hipcc libhipblas-dev librocblas-dev cmake build-essential
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp/
git checkout b2110
CC=clang-15 CXX=clang++-15 cmake -S . -B build -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS="gfx803;gfx900;gfx906;gfx908;gfx90a;gfx1010;gfx1030" -DCMAKE_BUILD_TYPE=Release
make -j16 -C build
build/bin/main -ngl 32 --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -m ~/Downloads/dolphin-2.2.1-mistral-7b.Q5_K_M.gguf --prompt "Once upon a time"
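The `-DAMDGPU_TARGETS` value passed to cmake above is a semicolon-separated list of GPU ISAs, and the build is faster if you restrict it to the GPUs you actually have. A small sketch (POSIX sh) of composing that value from a space-separated list; the two example ISAs are assumptions for illustration:

```shell
#!/bin/sh
# Compose the semicolon-separated AMDGPU_TARGETS value from a space-separated
# list of gfx ISAs. Restricting the list to your own GPU shortens the build.
targets="gfx906 gfx1030"   # illustrative: adjust to match your hardware
AMDGPU_TARGETS=$(echo "$targets" | tr ' ' ';')
echo "$AMDGPU_TARGETS"     # gfx906;gfx1030
```

You can then pass `-DAMDGPU_TARGETS="$AMDGPU_TARGETS"` to the cmake invocation above in place of the full list.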
@userbox020

userbox020 commented Mar 4, 2024

sup bro, I tried to run this repo inside a Docker container on Ubuntu 22.04, but it only detects the CPU

here's my Dockerfile

# Using Debian Bullseye for better stability
FROM debian:bullseye

# Build argument for Clang version to make it flexible
ARG CLANG_VERSION=11

# Set non-interactive frontend to avoid prompts during build
ENV DEBIAN_FRONTEND=noninteractive

# Update system and install essential packages
RUN apt-get update && apt-get upgrade -y && apt-get install -y \
    git \
    cmake \
    build-essential \
    "clang-$CLANG_VERSION" \
    libomp-dev # OpenMP library, often used with Clang

# Clone the specific repo and checkout to the specified commit
RUN git clone https://github.com/ggerganov/llama.cpp.git /llama.cpp && \
    cd /llama.cpp && \
    git checkout b2110

# Build the project
RUN cd /llama.cpp && \
    CC="clang-$CLANG_VERSION" CXX="clang++-$CLANG_VERSION" cmake -H. -Bbuild -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS="gfx803;gfx900;gfx906;gfx908;gfx90a;gfx1010;gfx1030" -DCMAKE_BUILD_TYPE=Release && \
    make -j$(nproc) -C build

# Set the working directory to the build directory
WORKDIR /llama.cpp/build

# Command to keep the container running (replace this with your desired command)
CMD ["tail", "-f", "/dev/null"]

and here's my docker compose file

version: '3.8'
services:
  llama-builder:
    build:
      context: . # Assumes Dockerfile is in the same directory
      args:
        CLANG_VERSION: "11" # Set the Clang version here, adjust as necessary
    devices:
      - "/dev/dri:/dev/dri" # For GPU access, might not be necessary for all use cases
      - "/dev/kfd:/dev/kfd" # For AMD ROCm access, adjust if not using ROCm
    security_opt:
      - seccomp=unconfined
    volumes:
      - "$HOME:/mnt/home" # Mount home directory to access necessary files
      - "/media/500GB_HDD/Models/:/Models/" # Model files

this is how I start the docker container: sudo docker compose up --build

and then I try to run a model like the following

./bin/main -m /Models/openhermes-2.5-neural-chat-v3-3-slerp.Q8_0.gguf -p "Hi you how are you" -ngl 90 --no-mmap --numa

and it's not offloading anything to the GPU

can you give me a hand bro? I'd appreciate it

@cgmb
Author

cgmb commented Mar 4, 2024

@userbox020, you need to apt-get install -y hipcc libhipblas-dev librocblas-dev and use a newer OS and compiler. I suggest FROM ubuntu:mantic and CLANG_VERSION=15. The necessary packages are not available for Debian Bullseye or clang-11. Those are too old.
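An untested sketch of the Dockerfile above with those changes applied (newer base image, clang-15, and the hipcc/hipBLAS/rocBLAS packages); the repo path and pinned tag are taken from the earlier messages, everything else is unchanged:

```dockerfile
# Sketch only: ubuntu:mantic base and clang-15, as suggested above
FROM ubuntu:mantic

ARG CLANG_VERSION=15
ENV DEBIAN_FRONTEND=noninteractive

# The HIP/ROCm dev packages are what enable GPU support in the build
RUN apt-get update && apt-get upgrade -y && apt-get install -y \
    git \
    cmake \
    build-essential \
    "clang-$CLANG_VERSION" \
    hipcc \
    libhipblas-dev \
    librocblas-dev

RUN git clone https://github.com/ggerganov/llama.cpp.git /llama.cpp && \
    cd /llama.cpp && \
    git checkout b2110

RUN cd /llama.cpp && \
    CC="clang-$CLANG_VERSION" CXX="clang++-$CLANG_VERSION" cmake -S . -B build -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS="gfx803;gfx900;gfx906;gfx908;gfx90a;gfx1010;gfx1030" -DCMAKE_BUILD_TYPE=Release && \
    make -j"$(nproc)" -C build

WORKDIR /llama.cpp/build

CMD ["tail", "-f", "/dev/null"]
```

The compose file can stay as it is, since the `/dev/kfd` and `/dev/dri` device mappings are still required for the container to see the GPU.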
