ollama/gpu
Daniel Hiltgen 30a7d7096c Bump VRAM buffer back up
Under stress scenarios we're seeing OOMs so this should help stabilize
the allocations under heavy concurrency stress.
2024-05-10 09:15:28 -07:00
amd_common.go Support Fedora's standard ROCm location 2024-05-01 15:47:12 -07:00
amd_hip_windows.go Record more GPU information 2024-05-09 14:18:14 -07:00
amd_linux.go Record more GPU information 2024-05-09 14:18:14 -07:00
amd_windows.go Record more GPU information 2024-05-09 14:18:14 -07:00
assets.go Centralize server config handling 2024-05-05 16:49:50 -07:00
cpu_common.go Wait for GPU free memory reporting to converge 2024-05-09 14:56:01 -07:00
cuda_common.go Request and model concurrency 2024-04-22 19:29:12 -07:00
gpu.go Bump VRAM buffer back up 2024-05-10 09:15:28 -07:00
gpu_darwin.go Bump VRAM buffer back up 2024-05-10 09:15:28 -07:00
gpu_info.h Record more GPU information 2024-05-09 14:18:14 -07:00
gpu_info_cpu.c Record more GPU information 2024-05-09 14:18:14 -07:00
gpu_info_cudart.c Request and model concurrency 2024-04-22 19:29:12 -07:00
gpu_info_cudart.h Add CUDA Driver API for GPU discovery 2024-04-30 18:00:45 -07:00
gpu_info_darwin.h darwin: no partial offloading if required memory greater than system 2024-04-16 11:22:38 -07:00
gpu_info_darwin.m darwin: no partial offloading if required memory greater than system 2024-04-16 11:22:38 -07:00
gpu_info_nvcuda.c Record more GPU information 2024-05-09 14:18:14 -07:00
gpu_info_nvcuda.h Record more GPU information 2024-05-09 14:18:14 -07:00
gpu_test.go Request and model concurrency 2024-04-22 19:29:12 -07:00
types.go Record more GPU information 2024-05-09 14:18:14 -07:00