ollama

Author	SHA1	Message	Date
Daniel Hiltgen	d966b730ac	Switch windows build to fully dynamic. Refactor where we store build outputs, and support a fully dynamic loading model on windows so the base executable has no special dependencies thus doesn't require a special PATH.	2024-01-02 15:36:16 -08:00
Daniel Hiltgen	9a70aecccb	Refactor how we augment llama.cpp. This changes the model for llama.cpp inclusion so we're not applying a patch, but instead have the C++ code directly in the ollama tree, which should make it easier to refine and update over time.	2024-01-02 15:35:55 -08:00
Jeffrey Morgan	d4ebdadbe7	enable `cache_prompt` by default	2023-12-27 14:23:42 -05:00
K0IN	10da41d677	Add Cache flag to api (#1642)	2023-12-22 17:16:20 -05:00
Daniel Hiltgen	325d74985b	Fix CPU performance on hyperthreaded systems. The default thread count logic was broken and resulted in 2x the number of threads as it should on a hyperthreading CPU resulting in thrashing and poor performance.	2023-12-21 16:23:36 -08:00
Daniel Hiltgen	d9cd3d9667	Revive windows build. The windows native setup still needs some more work, but this gets it building again and if you set the PATH properly, you can run the resulting exe on a cuda system.	2023-12-20 17:21:54 -08:00
Daniel Hiltgen	7555ea44f8	Revamp the dynamic library shim. This switches the default llama.cpp to be CPU based, and builds the GPU variants as dynamically loaded libraries which we can select at runtime. This also bumps the ROCm library to version 6 given 5.7 builds don't work on the latest ROCm library that just shipped.	2023-12-20 14:45:57 -08:00
Daniel Hiltgen	6558f94ed0	Fix darwin intel build	2023-12-19 13:32:24 -08:00
Daniel Hiltgen	1b991d0ba9	Refine build to support CPU only. If someone checks out the ollama repo and doesn't install the CUDA library, this will ensure they can build a CPU only version	2023-12-19 09:05:46 -08:00
Daniel Hiltgen	9adca7f711	Bump llama.cpp to b1662 and set n_parallel=1	2023-12-19 09:05:46 -08:00
Daniel Hiltgen	35934b2e05	Adapted rocm support to cgo based llama.cpp	2023-12-19 09:05:46 -08:00
Daniel Hiltgen	d4cd695759	Add cgo implementation for llama.cpp. Run the server.cpp directly inside the Go runtime via cgo while retaining the LLM Go abstractions.	2023-12-19 09:05:46 -08:00

12 commits