ollama/llm
Daniel Hiltgen 795674dd90 Bump llama.cpp to b1842 and add new cuda lib dep
Upstream llama.cpp has added a new dependency on the NVIDIA CUDA
driver library (libcuda.so). It ships as part of the driver
distribution rather than the general CUDA libraries and is not
available as a static archive, so we cannot statically link it.
This may introduce additional compatibility challenges that we'll
need to keep an eye on.
2024-01-16 12:53:52 -08:00
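
Because libcuda.so ships only with the driver, it has to be located and
resolved at runtime rather than linked at build time. The following is a
minimal C sketch of such a runtime probe using dlopen/dlsym; cuInit and
cuDriverGetVersion are real CUDA driver API entry points, but the candidate
library names, flags, and fallback flow are illustrative assumptions, not
the repository's actual loader code.

    /*
     * Minimal sketch (not the repo's actual loader): probe for the NVIDIA
     * CUDA driver library at runtime with dlopen/dlsym, since libcuda.so
     * ships with the driver and cannot be statically linked.
     */
    #include <dlfcn.h>
    #include <stdio.h>

    typedef int (*cu_init_fn)(unsigned int); /* CUresult cuInit(unsigned int) */
    typedef int (*cu_version_fn)(int *);     /* CUresult cuDriverGetVersion(int *) */

    int main(void) {
        /* libcuda.so.1 is the soname the driver package installs; the
           unversioned name may be absent without the dev symlink. */
        const char *names[] = { "libcuda.so.1", "libcuda.so" };
        void *h = NULL;
        for (int i = 0; i < 2 && h == NULL; i++)
            h = dlopen(names[i], RTLD_LAZY | RTLD_GLOBAL);
        if (h == NULL) {
            fprintf(stderr, "no CUDA driver library: %s\n", dlerror());
            return 1;  /* a real loader would fall back to a CPU variant */
        }

        cu_init_fn cu_init = (cu_init_fn)dlsym(h, "cuInit");
        cu_version_fn cu_version = (cu_version_fn)dlsym(h, "cuDriverGetVersion");
        if (cu_init == NULL || cu_version == NULL ||
            cu_init(0) != 0 /* CUDA_SUCCESS */) {
            fprintf(stderr, "CUDA driver present but unusable\n");
            dlclose(h);
            return 1;
        }

        int v = 0;
        cu_version(&v);
        printf("CUDA driver version %d.%d\n", v / 1000, (v % 1000) / 10);
        dlclose(h);
        return 0;
    }

A probe along these lines lets a loader treat a missing or broken driver as
"no usable GPU" and fall back to a CPU build of the server library instead
of failing outright, which is the kind of compatibility concern the commit
message flags.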
ext_server            Disable mmap with lora layers (#1985)                2024-01-13 23:36:31 -05:00
generate              Bump llama.cpp to b1842 and add new cuda lib dep     2024-01-16 12:53:52 -08:00
llama.cpp@584d674be6  Bump llama.cpp to b1842 and add new cuda lib dep     2024-01-16 12:53:52 -08:00
dyn_ext_server.c      Always dynamically load the llm server library       2024-01-11 08:42:47 -08:00
dyn_ext_server.go     do not cache prompt (#2018)                          2024-01-16 13:48:05 -05:00
dyn_ext_server.h      Always dynamically load the llm server library       2024-01-11 08:42:47 -08:00
ggml.go               add max context length check                         2024-01-12 14:54:07 -08:00
gguf.go               add max context length check                         2024-01-12 14:54:07 -08:00
llama.go              remove unused fields and functions                   2024-01-09 09:37:40 -08:00
llm.go                add max context length check                         2024-01-12 14:54:07 -08:00
payload_common.go     Merge pull request #1935 from dhiltgen/cpu_fallback  2024-01-11 15:52:32 -08:00
payload_darwin.go     Always dynamically load the llm server library       2024-01-11 08:42:47 -08:00
payload_linux.go      Always dynamically load the llm server library       2024-01-11 08:42:47 -08:00
payload_test.go       Fix up the CPU fallback selection                    2024-01-11 15:27:06 -08:00
payload_windows.go    Always dynamically load the llm server library       2024-01-11 08:42:47 -08:00
utils.go              partial decode ggml bin for more info                2023-08-10 09:23:10 -07:00