Daniel Hiltgen 6c5ccb11f9 Revamp ROCm support

This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var that defaults to `~/.ollama`. The logic was already
idempotent, so this should speed up startups after the first time a
new release is deployed.  It also cleans up after itself.

We now build only a single ROCm version (latest major) on both windows
and linux.  Given the large size of ROCm's tensor files, we split the
dependency out.  It's bundled into the installer on windows, and a
separate download on linux.  The linux install script is now smart and
detects the presence of AMD GPUs and looks to see if rocm v6 is already
present, and if not, then downloads our dependency tar file.

For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us.  For Windows, we now use Go's windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.

2024-03-07 10:36:50 -08:00


How to troubleshoot issues

Sometimes Ollama may not perform as expected. One of the best ways to figure out what happened is to take a look at the logs. Find the logs on Mac by running the command:

cat ~/.ollama/logs/server.log

On Linux systems with systemd, the logs can be found with this command:

journalctl -u ollama

When you run Ollama in a container, the logs go to stdout/stderr in the container:

docker logs <container-name>

(Use docker ps to find the container name)

If you run ollama serve manually in a terminal, the logs appear in that terminal.
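As a minimal sketch, the macOS and Linux cases above can be wrapped in a small helper that prints the matching log command for a given platform. The function name ollama_log_command is hypothetical and not part of Ollama; it only echoes the commands already listed above.

```shell
# Hypothetical helper (not shipped with Ollama): print the log command
# from this page that matches a kernel name as reported by `uname -s`.
ollama_log_command() {
  case "$1" in
    Darwin) echo 'cat ~/.ollama/logs/server.log' ;;  # macOS log file
    Linux)  echo 'journalctl -u ollama' ;;           # assumes a systemd install
    *)      echo "no known log command for: $1" >&2; return 1 ;;
  esac
}

# Usage: ollama_log_command "$(uname -s)"
```

Container and manual ollama serve runs are left out on purpose, since the kernel name alone cannot tell them apart.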

When you run Ollama on Windows, there are a few different locations. You can view them in an Explorer window by pressing <Win>+R and typing:

explorer %LOCALAPPDATA%\Ollama to view logs
explorer %LOCALAPPDATA%\Programs\Ollama to browse the binaries (The installer adds this to your user PATH)
explorer %HOMEPATH%\.ollama to browse where models and configuration are stored
explorer %TEMP% to browse where temporary executable files are stored in one or more ollama* directories

To enable additional debug logging to help troubleshoot problems, first quit the running app from the tray menu, then run the following in a PowerShell terminal:

$env:OLLAMA_DEBUG="1"
& "ollama app.exe"

Join the Discord for help interpreting the logs.

LLM libraries

Ollama includes multiple LLM libraries compiled for different GPUs and CPU vector features. Ollama tries to pick the best one based on the capabilities of your system. If this autodetection has problems, or you run into other problems (e.g. crashes in your GPU), you can work around this by forcing a specific LLM library. Among the CPU libraries, cpu_avx2 will perform the best, followed by cpu_avx, and the slowest but most compatible is cpu. Rosetta emulation under macOS will work with the cpu library.

In the server log, you will see a message that looks something like this (varies from release to release):

Dynamic LLM libraries [rocm_v6 cpu cpu_avx cpu_avx2 cuda_v11 rocm_v5]
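To see the available names one per line, you could pull them out of that log message with a small filter. This is a sketch only: the function name list_llm_libraries is hypothetical, and it assumes the message keeps the "Dynamic LLM libraries [...]" shape shown above, which may vary from release to release.

```shell
# Hypothetical helper: read server log text on stdin and print each
# library name from any "Dynamic LLM libraries [...]" line, one per line.
# Assumes the bracketed, space-separated format shown above.
list_llm_libraries() {
  sed -n 's/.*Dynamic LLM libraries \[\(.*\)\].*/\1/p' | tr ' ' '\n'
}

# Usage on Linux with systemd: journalctl -u ollama | list_llm_libraries
```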

Experimental LLM Library Override

You can set OLLAMA_LLM_LIBRARY to any of the available LLM libraries to bypass autodetection. For example, if you have a CUDA card but want to force the CPU LLM library with AVX2 vector support, use:

OLLAMA_LLM_LIBRARY="cpu_avx2" ollama serve

You can see what features your CPU has with the following command:

grep flags /proc/cpuinfo | head -1
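If you want to choose a CPU library from those flags, the ordering described earlier (cpu_avx2, then cpu_avx, then cpu) can be sketched as a small function. The name pick_cpu_library is hypothetical, and this is not how Ollama's own autodetection is implemented; it only matches whole flag names in the string you pass in.

```shell
# Hypothetical helper: map a space-separated CPU flags string to one of
# the CPU library names from this page. Padding with spaces makes the
# patterns match whole flags only (so "avx512f" does not count as "avx").
pick_cpu_library() {
  case " $1 " in
    *" avx2 "*) echo "cpu_avx2" ;;
    *" avx "*)  echo "cpu_avx" ;;
    *)          echo "cpu" ;;
  esac
}

# Usage (Linux):
#   OLLAMA_LLM_LIBRARY="$(pick_cpu_library "$(grep flags /proc/cpuinfo | head -1)")" ollama serve
```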

AMD Radeon GPU Support

Ollama leverages the AMD ROCm library, which does not support all AMD GPUs. In some cases you can force the system to try to use a similar GPU type. For example, the Radeon RX 5400 is gfx1034 (also known as 10.3.4). ROCm does not support this patch level; the closest supported target is gfx1030. You can use the environment variable HSA_OVERRIDE_GFX_VERSION with x.y.z syntax. So, for example, to force the system to run on the RX 5400, you would set HSA_OVERRIDE_GFX_VERSION="10.3.0" as an environment variable for the server.

At this time, the known supported GPU types are the following (this may change from release to release):

gfx900
gfx906
gfx908
gfx90a
gfx940
gfx941
gfx942
gfx1030
gfx1100
gfx1101
gfx1102

This will not work for all unsupported GPUs. Reach out on Discord or file an issue for additional help.
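The x.y.z value for HSA_OVERRIDE_GFX_VERSION follows from the gfx name in the way the RX 5400 example suggests (gfx1034 is 10.3.4, gfx1030 is 10.3.0). A minimal sketch of that conversion is below. The function name gfx_to_hsa_version is hypothetical, and it deliberately rejects names with a letter suffix such as gfx90a, because this page does not say how those map.

```shell
# Hypothetical helper: convert an all-decimal gfx name to x.y.z form,
# assuming the last digit is the patch level, the digit before it is the
# minor version, and everything before that is the major version.
gfx_to_hsa_version() {
  digits=${1#gfx}
  case "$digits" in
    ''|*[!0-9]*) echo "unsupported name: $1" >&2; return 1 ;;
  esac
  if [ "${#digits}" -lt 3 ]; then
    echo "unsupported name: $1" >&2
    return 1
  fi
  major=${digits%??}        # drop the last two digits
  rest=${digits#"$major"}   # the last two digits
  echo "$major.${rest%?}.${rest#?}"
}

# Usage: gfx_to_hsa_version gfx1030   (pass the closest supported target)
```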

Installing older versions on Linux

If you run into problems on Linux and want to install an older version, you can tell the install script which version to install.

curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION="0.1.27" sh
