Add docs for GPU selection and nvidia uvm workaround

2024-03-21 11:17:19 +01:00
commit d8fdbfd8da
parent a5ba0fcf78
2 changed files with 28 additions and 10 deletions
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -228,13 +228,3 @@ To unload the model and free up memory use:
 ```shell
 curl http://localhost:11434/api/generate -d '{"model": "llama2", "keep_alive": 0}'
 ```
-
-## Controlling which GPUs to use
-
-By default, on Linux and Windows, Ollama will attempt to use Nvidia GPUs, or
-Radeon GPUs, and will use all the GPUs it can find. You can limit which GPUs
-will be utilized by setting the environment variable `CUDA_VISIBLE_DEVICES` for
-NVIDIA cards, or `HIP_VISIBLE_DEVICES` for Radeon GPUs to a comma delimited list
-of GPU IDs.  You can see the list of devices with GPU tools such as `nvidia-smi` or
-`rocminfo`. You can set to an invalid GPU ID (e.g., "-1") to bypass the GPU and
-fallback to CPU.
--- a/docs/gpu.md
+++ b/docs/gpu.md
@@ -29,6 +29,21 @@ Check your compute compatibility to see if your card is supported:
 |                    | Quadro              | `K2200` `K1200` `K620` `M1200` `M520` `M5000M` `M4000M` `M3000M` `M2000M` `M1000M` `K620M` `M600M` `M500M`  |


+### GPU Selection
+
+If you have multiple NVIDIA GPUs in your system and want to limit Ollama to a
+subset, you can set `CUDA_VISIBLE_DEVICES` to a comma-separated list of GPUs.
+Numeric IDs may be used, but their ordering may vary, so UUIDs are more reliable.
+You can discover the UUIDs of your GPUs by running `nvidia-smi -L`. If you want to
+ignore the GPUs and force CPU usage, use an invalid GPU ID (e.g., "-1").
+
+### Laptop Suspend Resume
+
+On Linux, after a suspend/resume cycle, Ollama will sometimes fail to discover
+your NVIDIA GPU and fall back to running on the CPU.  You can work around this
+driver bug by reloading the NVIDIA UVM driver with `sudo rmmod nvidia_uvm &&
+sudo modprobe nvidia_uvm`.
+
 ## AMD Radeon
 Ollama supports the following AMD GPUs:
 | Family         | Cards and accelerators                                                                                                               |
@@ -70,5 +85,18 @@ future release which should increase support for more GPUs.
 Reach out on [Discord](https://discord.gg/ollama) or file an
 [issue](https://github.com/ollama/ollama/issues) for additional help.

+### GPU Selection
+
+If you have multiple AMD GPUs in your system and want to limit Ollama to a
+subset, you can set `HIP_VISIBLE_DEVICES` to a comma-separated list of GPUs.
+You can see the list of devices with `rocminfo`.  If you want to ignore the GPUs
+and force CPU usage, use an invalid GPU ID (e.g., "-1").
+
+### Container Permission
+
+On some Linux distributions, SELinux can prevent containers from
+accessing the AMD GPU devices.  On the host system you can run
+`sudo setsebool container_use_devices=1` to allow containers to use devices.
+
 ### Metal (Apple GPUs)
 Ollama supports GPU acceleration on Apple devices via the Metal API.