ollama/gpu
Latest commit: 58d95cc9bd "Switch back to subprocessing for llama.cpp" by Daniel Hiltgen, 2024-04-01 16:48:18 -07:00

This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process, shut it down when idle, and
gracefully restart it if it has problems. This also serves as a first step toward
running multiple copies to support multiple models concurrently.
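For illustration, here is a minimal Go sketch of the subprocess pattern the commit describes: start llama.cpp as a child process, restart it if it exits unexpectedly, and shut it down after an idle timeout. This is not the actual ollama implementation; the "llama-server" binary name, its flags, and the runner/stopIfIdle helpers are all hypothetical.

```go
// Minimal sketch of a subprocess supervisor; not the real ollama code.
// The binary name, flags, and struct/method names below are hypothetical.
package main

import (
	"log/slog"
	"os/exec"
	"sync"
	"time"
)

// runner supervises a single llama.cpp server child process.
type runner struct {
	mu       sync.Mutex
	cmd      *exec.Cmd
	lastUsed time.Time
	stopping bool
}

// start launches the child process and restarts it if it dies unexpectedly.
func (r *runner) start() error {
	r.mu.Lock()
	defer r.mu.Unlock()

	// Placeholder command line; a real server binary and flags would differ.
	cmd := exec.Command("llama-server", "--port", "8081")
	if err := cmd.Start(); err != nil {
		return err
	}
	r.cmd = cmd
	r.lastUsed = time.Now()
	r.stopping = false

	go func() {
		err := cmd.Wait()
		r.mu.Lock()
		intentional := r.stopping
		r.mu.Unlock()
		if !intentional {
			// Crash or unexpected exit: restart the subprocess.
			slog.Warn("llama.cpp subprocess exited, restarting", "error", err)
			_ = r.start()
		}
	}()
	return nil
}

// stopIfIdle kills the child once it has been unused for longer than timeout,
// reclaiming the memory an in-process llama.cpp would keep holding.
func (r *runner) stopIfIdle(timeout time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.cmd != nil && time.Since(r.lastUsed) > timeout {
		slog.Info("idle timeout reached, stopping llama.cpp subprocess")
		r.stopping = true
		_ = r.cmd.Process.Kill()
		r.cmd = nil
	}
}

func main() {
	r := &runner{}
	if err := r.start(); err != nil {
		slog.Error("failed to start subprocess", "error", err)
		return
	}
	// In real use, request handlers would refresh r.lastUsed on every call.
	ticker := time.NewTicker(time.Minute)
	defer ticker.Stop()
	for range ticker.C {
		r.stopIfIdle(5 * time.Minute)
	}
}
```

Putting llama.cpp behind a process boundary means a memory leak or crash in the native code only costs a subprocess restart instead of taking down the whole server, and spawning several such runners is the natural next step toward serving multiple models concurrently.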
File | Last commit message | Last commit date
amd_common.go | Fix iGPU detection for linux | 2024-03-12 16:57:19 -07:00
amd_hip_windows.go | Revamp ROCm support | 2024-03-07 10:36:50 -08:00
amd_linux.go | Switch back to subprocessing for llama.cpp | 2024-04-01 16:48:18 -07:00
amd_windows.go | Finish unwinding idempotent payload logic | 2024-03-09 08:34:39 -08:00
assets.go | Switch back to subprocessing for llama.cpp | 2024-04-01 16:48:18 -07:00
cpu_common.go | Mechanical switch from log to slog | 2024-01-18 14:12:57 -08:00
gpu.go | update memory calculations | 2024-04-01 13:16:32 -07:00
gpu_darwin.go | Allow setting max vram for workarounds | 2024-03-06 17:15:06 -08:00
gpu_info.h | add support for libcudart.so for CUDA devices (adds Jetson support) | 2024-03-25 11:07:44 -04:00
gpu_info_cpu.c | calculate overhead based on number of gpu devices (#1875) | 2024-01-09 15:53:33 -05:00
gpu_info_cudart.c | add support for libcudart.so for CUDA devices (adds Jetson support) | 2024-03-25 11:07:44 -04:00
gpu_info_cudart.h | add support for libcudart.so for CUDA devices (adds Jetson support) | 2024-03-25 11:07:44 -04:00
gpu_info_darwin.h | Determine max VRAM on macOS using recommendedMaxWorkingSetSize (#2354) | 2024-02-25 18:16:45 -05:00
gpu_info_darwin.m | Determine max VRAM on macOS using recommendedMaxWorkingSetSize (#2354) | 2024-02-25 18:16:45 -05:00
gpu_info_nvml.c | add support for libcudart.so for CUDA devices (adds Jetson support) | 2024-03-25 11:07:44 -04:00
gpu_info_nvml.h | add support for libcudart.so for CUDA devices (adds Jetson support) | 2024-03-25 11:07:44 -04:00
gpu_test.go | Merge pull request #1819 from dhiltgen/multi_variant | 2024-01-11 14:00:48 -08:00
types.go | update memory calculations | 2024-04-01 13:16:32 -07:00