ollama/gpu
Daniel Hiltgen 354ad9254e Wait for GPU free memory reporting to converge
The GPU drivers take a while to update their free memory reporting, so we need
to wait until the values converge with what we're expecting before proceeding
to start another runner in order to get an accurate picture.
2024-05-09 14:56:01 -07:00
..
amd_common.go Support Fedoras standard ROCm location 2024-05-01 15:47:12 -07:00
amd_hip_windows.go Record more GPU information 2024-05-09 14:18:14 -07:00
amd_linux.go Record more GPU information 2024-05-09 14:18:14 -07:00
amd_windows.go Record more GPU information 2024-05-09 14:18:14 -07:00
assets.go Centralize server config handling 2024-05-05 16:49:50 -07:00
cpu_common.go Wait for GPU free memory reporting to converge 2024-05-09 14:56:01 -07:00
cuda_common.go Request and model concurrency 2024-04-22 19:29:12 -07:00
gpu.go Record more GPU information 2024-05-09 14:18:14 -07:00
gpu_darwin.go llm: add minimum based on layer size 2024-05-06 17:04:19 -07:00
gpu_info.h Record more GPU information 2024-05-09 14:18:14 -07:00
gpu_info_cpu.c Record more GPU information 2024-05-09 14:18:14 -07:00
gpu_info_cudart.c Request and model concurrency 2024-04-22 19:29:12 -07:00
gpu_info_cudart.h Add CUDA Driver API for GPU discovery 2024-04-30 18:00:45 -07:00
gpu_info_darwin.h darwin: no partial offloading if required memory greater than system 2024-04-16 11:22:38 -07:00
gpu_info_darwin.m darwin: no partial offloading if required memory greater than system 2024-04-16 11:22:38 -07:00
gpu_info_nvcuda.c Record more GPU information 2024-05-09 14:18:14 -07:00
gpu_info_nvcuda.h Record more GPU information 2024-05-09 14:18:14 -07:00
gpu_test.go Request and model concurrency 2024-04-22 19:29:12 -07:00
types.go Record more GPU information 2024-05-09 14:18:14 -07:00