ollama

Author	SHA1	Message	Date
Daniel Hiltgen	58d95cc9bd	Switch back to subprocessing for llama.cpp This should resolve a number of memory leak and stability defects by allowing us to isolate llama.cpp in a separate process and shutdown when idle, and gracefully restart if it has problems. This also serves as a first step to be able to run multiple copies to support multiple models concurrently.	2024-04-01 16:48:18 -07:00
Jeremy	dfc6721b20	add support for libcudart.so for CUDA devices (adds Jetson support)	2024-03-25 11:07:44 -04:00
Daniel Hiltgen	85129d3a32	Adapt our build for imported server.cpp	2024-03-12 14:57:15 -07:00
John	23ebe8fe11	fix some typos (#2973 ) Signed-off-by: hishope <csqiye@126.com>	2024-03-06 22:50:11 -08:00
Bernhard M. Wiedemann	76e5d9ec88	Omit build date from gzip headers See https://reproducible-builds.org/ for why this is good. This patch was done while working on reproducible builds for openSUSE.	2024-02-29 16:48:19 +01:00
Daniel Hiltgen	e1f50377f4	Harden generate patching model Only apply patches if we have any, and make sure to cleanup every file we patched at the end to leave the tree clean	2024-02-01 19:34:36 -08:00
Daniel Hiltgen	e02ecfb6c8	Merge pull request #2116 from dhiltgen/cc_50_80 Add support for CUDA 5.0 cards	2024-01-27 10:28:38 -08:00
Jeffrey Morgan	a64570dcae	Fix clearing kv cache between requests with the same prompt (#2186 ) * Fix clearing kv cache between requests with the same prompt * fix powershell script	2024-01-25 13:46:20 -08:00
Daniel Hiltgen	a447a083f2	Add compute capability 5.0, 7.5, and 8.0	2024-01-20 14:24:05 -08:00
Jeffrey Morgan	4c54f0ddeb	sign dylibs on macOS (#2101 )	2024-01-19 19:24:11 -05:00
Jeffrey Morgan	dc88cc3981	use `gzip` for runner embedding (#2067 )	2024-01-19 13:23:03 -05:00
Daniel Hiltgen	1b249748ab	Add multiple CPU variants for Intel Mac This also refines the build process for the ext_server build.	2024-01-17 15:08:54 -08:00
Jeffrey Morgan	288ef8ff95	add `gcc -lstdc++` flag for linux cpu (#1974 )	2024-01-13 03:53:00 -05:00
Jeffrey Morgan	4cf17990f7	use g++ to build `libext_server.so` on linux (#1972 )	2024-01-13 03:12:42 -05:00
Daniel Hiltgen	d88c527be3	Build multiple CPU variants and pick the best This reduces the built-in linux version to not use any vector extensions which enables the resulting builds to run under Rosetta on MacOS in Docker. Then at runtime it checks for the actual CPU vector extensions and loads the best CPU library available	2024-01-11 08:42:47 -08:00
Bruce MacDonald	3367b5f3df	remove unused generate patches (#1810 )	2024-01-05 11:25:45 -05:00
Daniel Hiltgen	9983fa5f4e	Cleaup stale submodule If the tree has a stale submodule, make sure we clean it up first	2024-01-04 13:40:16 -08:00
Daniel Hiltgen	77d96da94b	Code shuffle to clean up the llm dir	2024-01-04 12:12:05 -08:00

18 commits