Jeffrey Morgan
c336693f07
calculate overhead based on number of gpu devices (#1875)
2024-01-09 15:53:33 -05:00
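The idea of scaling reserved memory by GPU count could be sketched as below; the function name and the 512 MiB per-device figure are assumptions for illustration, not values from the commit:

```go
package main

import "fmt"

// perDeviceOverhead is a hypothetical fixed reservation per GPU, in bytes.
const perDeviceOverhead uint64 = 512 * 1024 * 1024 // 512 MiB

// totalOverhead scales the reserved headroom by the number of GPU devices,
// so multi-GPU systems set aside proportionally more memory.
func totalOverhead(deviceCount int) uint64 {
	if deviceCount < 1 {
		deviceCount = 1 // treat "no GPUs detected" as a single-device reservation
	}
	return uint64(deviceCount) * perDeviceOverhead
}

func main() {
	fmt.Println(totalOverhead(2)) // two GPUs: twice the per-device reservation
}
```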
Daniel Hiltgen
d74ce6bd4f
Detect very old CUDA GPUs and fall back to CPU
...
If we try to load the CUDA library on an old GPU, it panics and crashes
the server. This checks the compute capability before we load the
library so we can gracefully fall back to CPU mode.
2024-01-06 21:40:29 -08:00
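The gating described above, checking compute capability before loading the CUDA library, might look roughly like this; the minimum version (5.0) and function names here are illustrative assumptions, not taken from the commit:

```go
package main

import "fmt"

// Hypothetical minimum CUDA compute capability the bundled library supports;
// devices below it must fall back to CPU inference instead of loading CUDA.
const (
	minMajor = 5
	minMinor = 0
)

// cudaSupported reports whether a device's compute capability meets the
// minimum, so the CUDA library is only loaded when it will not crash.
func cudaSupported(major, minor int) bool {
	if major != minMajor {
		return major > minMajor
	}
	return minor >= minMinor
}

func main() {
	fmt.Println(cudaSupported(3, 5)) // very old GPU: fall back to CPU
	fmt.Println(cudaSupported(8, 6)) // modern GPU: CUDA path is safe
}
```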
Jeffrey Morgan
1caa56128f
add cuda lib path for nvidia container toolkit
2024-01-05 21:10:37 -05:00
Jeffrey Morgan
df32537312
gpu: read memory info from all cuda devices (#1802)
...
* gpu: read memory info from all cuda devices
* add `LOOKUP_SIZE` constant
* better constant name
* address comments
2024-01-05 11:25:58 -05:00
Daniel Hiltgen
a2ad952440
Fix windows system memory lookup
...
This refines the gpu package error handling and fixes a bug with the
system memory lookup on windows.
2024-01-03 08:50:01 -08:00
Daniel Hiltgen
1d1eb1688c
Additional nvidia-ml path to check
2023-12-19 15:52:34 -08:00
Daniel Hiltgen
5646826a79
Add WSL2 path to nvidia-ml.so library
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
1b991d0ba9
Refine build to support CPU only
...
If someone checks out the ollama repo and doesn't install the CUDA
library, this ensures they can still build a CPU-only version
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
35934b2e05
Adapted ROCm support to cgo-based llama.cpp
2023-12-19 09:05:46 -08:00