This change refines where we extract the LLM libraries to by adding a new OLLAMA_HOME env var, which defaults to `~/.ollama`. The extraction logic was already idempotent, so this should speed up startups after the first run of a new release, and it cleans up after itself. We now build only a single ROCm version (the latest major) on both Windows and Linux. Given the large size of ROCm's tensor files, we split that dependency out: it's bundled into the installer on Windows, and a separate download on Linux. The Linux install script now detects the presence of AMD GPUs and checks whether ROCm v6 is already present; if not, it downloads our dependency tar file. For Linux discovery, we now use sysfs and check each GPU against what ROCm supports so we can degrade to CPU gracefully instead of having llama.cpp+ROCm assert/crash on us. For Windows, we now use Go's Windows dynamic library loading logic to access the amdhip64.dll APIs to query the GPU information.
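For example, to relocate where the libraries are extracted (a minimal sketch; OLLAMA_HOME is the variable added by this change, but the path is illustrative and assumes the server reads the variable at startup):
# /var/lib/ollama is an arbitrary example path
OLLAMA_HOME=/var/lib/ollama ollama serve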
Ollama on Linux
Install
Install Ollama by running this one-liner:
curl -fsSL https://ollama.com/install.sh | sh
AMD Radeon GPU support
While AMD has contributed the amdgpu driver upstream to the official Linux kernel source, the version there is older and may not support all ROCm features. We recommend you install the latest driver from https://www.amd.com/en/support/linux-drivers for best support of your Radeon GPU.
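Before upgrading, you can check which amdgpu version your kernel currently ships (standard kernel tooling, not an Ollama command):
# print the version of the in-kernel amdgpu module
modinfo amdgpu | grep -i ^version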
Manual install
Download the ollama binary
Ollama is distributed as a self-contained binary. Download it to a directory in your PATH:
sudo curl -L https://ollama.com/download/ollama-linux-amd64 -o /usr/bin/ollama
sudo chmod +x /usr/bin/ollama
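As a quick sanity check that the download works (ollama supports a standard version flag):
# should print the installed client version
ollama --version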
Adding Ollama as a startup service (recommended)
Create a user for Ollama:
sudo useradd -r -s /bin/false -m -d /usr/share/ollama ollama
Create a service file in /etc/systemd/system/ollama.service:
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
[Install]
WantedBy=default.target
Then start the service:
sudo systemctl daemon-reload
sudo systemctl enable ollama
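If you want the service to use a non-default OLLAMA_HOME (see the change notes at the top of this page), a standard systemd override works; a sketch, assuming the server reads the variable at startup, with an illustrative path:
# open an override file for the service
sudo systemctl edit ollama
# add the following to the override:
[Service]
Environment="OLLAMA_HOME=/var/lib/ollama"
# then reload and restart
sudo systemctl daemon-reload
sudo systemctl restart ollama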
Install CUDA drivers (optional – for Nvidia GPUs)
Download and install CUDA.
Verify that the drivers are installed by running the following command, which should print details about your GPU:
nvidia-smi
Start Ollama
Start Ollama using systemd:
sudo systemctl start ollama
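By default the server listens on 127.0.0.1:11434; a quick check that it is up:
# the root endpoint responds with a short status message
curl http://127.0.0.1:11434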
Update
Update ollama by running the install script again:
curl -fsSL https://ollama.com/install.sh | sh
Or by downloading the ollama binary:
sudo curl -L https://ollama.com/download/ollama-linux-amd64 -o /usr/bin/ollama
sudo chmod +x /usr/bin/ollama
Viewing logs
To view logs of Ollama running as a startup service, run:
journalctl -u ollama
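To follow the log live while reproducing an issue, use journalctl's standard follow flag:
# -f streams new entries as they are written
journalctl -u ollama -f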
Uninstall
Remove the ollama service:
sudo systemctl stop ollama
sudo systemctl disable ollama
sudo rm /etc/systemd/system/ollama.service
Remove the ollama binary from your bin directory (either /usr/local/bin, /usr/bin, or /bin):
sudo rm $(which ollama)
Remove the downloaded models and Ollama service user and group:
sudo rm -r /usr/share/ollama
sudo userdel ollama
sudo groupdel ollama