# Ollama on Linux

## Install

Install Ollama by running this one-liner:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

## AMD Radeon GPU support

While AMD has contributed the `amdgpu` driver upstream to the official Linux
kernel source, the version is older and may not support all ROCm features. We
recommend you install the latest driver from
https://www.amd.com/en/support/linux-drivers for best support of your Radeon
GPU.

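You can check from sysfs whether an AMD GPU is visible to the kernel before installing anything. The sketch below is only illustrative (the install script's own detection is more thorough); it relies on the fact that AMD's PCI vendor id is `0x1002`:

```bash
# List DRM devices whose PCI vendor id is AMD's (0x1002).
# Prints nothing extra and exits cleanly when no such GPU is present.
found=no
for vendor in /sys/class/drm/card*/device/vendor; do
  [ -e "$vendor" ] || continue
  if grep -qi '0x1002' "$vendor"; then
    found=yes
    echo "AMD GPU: ${vendor%/device/vendor}"
  fi
done
echo "amd gpu detected: $found"
```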
## Manual install

### Download the `ollama` binary

Ollama is distributed as a self-contained binary. Download it to a directory in your `PATH`:

```bash
sudo curl -L https://ollama.com/download/ollama-linux-amd64 -o /usr/bin/ollama
sudo chmod +x /usr/bin/ollama
```

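A quick sanity check that the download landed where expected and is executable (path as above):

```bash
# Report whether the ollama binary is present and executable
if [ -x /usr/bin/ollama ]; then
  echo "ollama: installed"
else
  echo "ollama: not found"
fi
```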
### Adding Ollama as a startup service (recommended)

Create a user for Ollama:

```bash
# -r: system account; -s /bin/false: no login shell;
# -m -d: create its home directory at /usr/share/ollama
sudo useradd -r -s /bin/false -m -d /usr/share/ollama ollama
```

Create a service file in `/etc/systemd/system/ollama.service`:

```ini
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3

[Install]
WantedBy=default.target
```

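If you later need to pass environment variables to the server (for example `OLLAMA_HOST` to change the bind address, or `OLLAMA_MODELS` to relocate model storage), a systemd drop-in keeps them out of the main unit. A sketch, created via `sudo systemctl edit ollama`:

```ini
# /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
```

Run `sudo systemctl daemon-reload` and restart the service after editing.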
Then reload `systemd` and enable Ollama to start on boot:

```bash
sudo systemctl daemon-reload
sudo systemctl enable ollama
```

### Install CUDA drivers (optional – for Nvidia GPUs)

[Download and install](https://developer.nvidia.com/cuda-downloads) CUDA.

Verify that the drivers are installed by running the following command, which should print details about your GPU:

```bash
nvidia-smi
```

### Start Ollama

Start Ollama using `systemd`:

```bash
sudo systemctl start ollama
```

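Once the service is up, the server listens on `127.0.0.1:11434` by default and answers a plain HTTP GET, which makes for an easy liveness check:

```bash
# Probe the default Ollama port; prints a message instead of
# failing if the server is not reachable (e.g. not started yet)
curl -s http://127.0.0.1:11434 || echo "server not reachable"
```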
## Update

Update Ollama by running the install script again:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

Or by downloading the `ollama` binary:

```bash
sudo curl -L https://ollama.com/download/ollama-linux-amd64 -o /usr/bin/ollama
sudo chmod +x /usr/bin/ollama
```

## Viewing logs

To view logs of Ollama running as a startup service, run:

```bash
journalctl -u ollama
```

## Uninstall

Remove the ollama service:

```bash
sudo systemctl stop ollama
sudo systemctl disable ollama
sudo rm /etc/systemd/system/ollama.service
```

Remove the ollama binary from your bin directory (either `/usr/local/bin`, `/usr/bin`, or `/bin`):

```bash
sudo rm $(which ollama)
```

Remove the downloaded models and Ollama service user and group:

```bash
sudo rm -r /usr/share/ollama
sudo userdel ollama
sudo groupdel ollama
```