ollama/gpu
Daniel Hiltgen 6fd04ca922 Improve multi-gpu handling at the limit
Still not complete: the prediction needs some refinement to account for the
space available on each discrete GPU, so we can see how many layers fit on
each one. Since one layer can't be split across multiple GPUs, free space
can't be treated as one logical block.
2024-06-14 14:51:40 -07:00
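The constraint described in the commit message can be illustrated with a minimal sketch. This is not the code from gpu.go or any other file in this directory. The function names `layersPerGPU` and `pooledLayers`, the uniform layer size, and the numbers are all hypothetical, chosen only to show why summing free memory across GPUs overestimates how many whole layers fit.

```go
package main

import "fmt"

// layersPerGPU is a hypothetical helper (not part of ollama). It returns how
// many whole layers fit on each GPU, assuming every layer has the same size
// and a layer cannot be split across devices.
func layersPerGPU(freeBytes []uint64, layerSize uint64) []int {
	counts := make([]int, len(freeBytes))
	if layerSize == 0 {
		return counts
	}
	for i, free := range freeBytes {
		counts[i] = int(free / layerSize) // integer division drops the unusable remainder
	}
	return counts
}

// pooledLayers is the naive estimate that treats all free memory as one
// logical block. It is shown only for comparison.
func pooledLayers(freeBytes []uint64, layerSize uint64) int {
	if layerSize == 0 {
		return 0
	}
	var total uint64
	for _, free := range freeBytes {
		total += free
	}
	return int(total / layerSize)
}

func main() {
	free := []uint64{1500, 1500} // free space per GPU, arbitrary units (assumed values)
	const layerSize = 1000       // size of one layer, same units (assumed value)

	perGPU := layersPerGPU(free, layerSize)
	sum := 0
	for _, n := range perGPU {
		sum += n
	}

	// Per-GPU counts, their sum, and the pooled (over-optimistic) estimate.
	fmt.Println(perGPU, sum, pooledLayers(free, layerSize)) // prints "[1 1] 2 3"
}
```

With two GPUs that each have 1500 units free and layers of 1000 units, only one layer fits per device, so two layers fit in total, while the pooled estimate claims three. Real layers vary in size and each device needs extra headroom, so this only shows the shape of the problem.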
amd_common.go Support Fedoras standard ROCm location 2024-05-01 15:47:12 -07:00
amd_hip_windows.go Record more GPU information 2024-05-09 14:18:14 -07:00
amd_linux.go Improve multi-gpu handling at the limit 2024-06-14 14:51:40 -07:00
amd_windows.go Refine GPU discovery to bootstrap once 2024-06-14 14:51:40 -07:00
assets.go lint 2024-06-04 11:13:30 -07:00
cpu_common.go Refine GPU discovery to bootstrap once 2024-06-14 14:51:40 -07:00
cuda_common.go lint linux 2024-06-04 11:13:30 -07:00
gpu.go Improve multi-gpu handling at the limit 2024-06-14 14:51:40 -07:00
gpu_darwin.go Bump VRAM buffer back up 2024-05-10 09:15:28 -07:00
gpu_info.h support ollama run on Intel GPUs 2024-05-24 11:18:27 +08:00
gpu_info_cpu.c Record more GPU information 2024-05-09 14:18:14 -07:00
gpu_info_cudart.c Refine GPU discovery to bootstrap once 2024-06-14 14:51:40 -07:00
gpu_info_cudart.h Refine GPU discovery to bootstrap once 2024-06-14 14:51:40 -07:00
gpu_info_darwin.h darwin: no partial offloading if required memory greater than system 2024-04-16 11:22:38 -07:00
gpu_info_darwin.m darwin: no partial offloading if required memory greater than system 2024-04-16 11:22:38 -07:00
gpu_info_nvcuda.c Refine GPU discovery to bootstrap once 2024-06-14 14:51:40 -07:00
gpu_info_nvcuda.h Refine GPU discovery to bootstrap once 2024-06-14 14:51:40 -07:00
gpu_info_oneapi.c support ollama run on Intel GPUs 2024-05-24 11:18:27 +08:00
gpu_info_oneapi.h support ollama run on Intel GPUs 2024-05-24 11:18:27 +08:00
gpu_oneapi.go support ollama run on Intel GPUs 2024-05-24 11:18:27 +08:00
gpu_test.go lint 2024-06-04 11:13:30 -07:00
types.go Improve multi-gpu handling at the limit 2024-06-14 14:51:40 -07:00