ollama

Jeffrey Morgan b24e8d17b2 Increase minimum CUDA memory allocation overhead and fix minimum overhead for multi-gpu (#1896)	2024-01-10 19:08:51 -05:00
* increase minimum cuda overhead and fix minimum overhead for multi-gpu
* fix multi gpu overhead
* limit overhead to 10% of all gpus
* better wording
* allocate fixed amount before layers
* fixed only includes graph alloc
gpu.go	Increase minimum CUDA memory allocation overhead and fix minimum overhead for multi-gpu (#1896)	2024-01-10 19:08:51 -05:00
gpu_darwin.go	calculate overhead based number of gpu devices (#1875)	2024-01-09 15:53:33 -05:00
gpu_info.h	calculate overhead based number of gpu devices (#1875)	2024-01-09 15:53:33 -05:00
gpu_info_cpu.c	calculate overhead based number of gpu devices (#1875)	2024-01-09 15:53:33 -05:00
gpu_info_cuda.c	Harden GPU mgmt library lookup	2024-01-10 15:06:41 -08:00
gpu_info_cuda.h	Harden GPU mgmt library lookup	2024-01-10 15:06:41 -08:00
gpu_info_rocm.c	Harden GPU mgmt library lookup	2024-01-10 15:06:41 -08:00
gpu_info_rocm.h	Harden GPU mgmt library lookup	2024-01-10 15:06:41 -08:00
gpu_test.go	calculate overhead based number of gpu devices (#1875)	2024-01-09 15:53:33 -05:00
types.go	calculate overhead based number of gpu devices (#1875)	2024-01-09 15:53:33 -05:00