ollama

Author	SHA1	Message	Date
Jeremy	440b7190ed	Update gen_linux.sh Added OLLAMA_CUSTOM_CUDA_DEFS and OLLAMA_CUSTOM_ROCM_DEFS instead of OLLAMA_CUSTOM_GPU_DEFS	2024-04-18 19:18:10 -04:00
Jeremy	52f5370c48	add support for custom gpu build flags for llama.cpp	2024-04-17 16:00:48 -04:00
Jeremy	7c000ec3ed	adds support for OLLAMA_CUSTOM_GPU_DEFS to customize GPU build flags	2024-04-17 15:21:05 -04:00
Blake Mizerany	1524f323a3	Revert "build.go: introduce a friendlier way to build Ollama (#3548 )" (#3564 )	2024-04-09 15:57:45 -07:00
Blake Mizerany	fccf3eecaa	build.go: introduce a friendlier way to build Ollama (#3548 ) This commit introduces a more friendly way to build Ollama dependencies and the binary without abusing `go generate` and removing the unnecessary extra steps it brings with it. This script also provides nicer feedback to the user about what is happening during the build process. At the end, it prints a helpful message to the user about what to do next (e.g. run the new local Ollama).	2024-04-09 14:18:47 -07:00
Jeffrey Morgan	63efa075a0	update generate scripts with new `LLAMA_CUDA` variable, set `HIP_PLATFORM` to avoid compiler errors (#3528 )	2024-04-07 19:29:51 -04:00
Daniel Hiltgen	dfe330fa1c	Merge pull request #3488 from mofanke/fix-windows-dll-compress fix dll compress in windows building	2024-04-04 16:12:13 -07:00
Daniel Hiltgen	36bd967722	Fail fast if mingw missing on windows	2024-04-04 09:51:26 -07:00
mofanke	4de0126719	fix dll compress in windows building	2024-04-04 21:27:33 +08:00
Daniel Hiltgen	e4a7e5b2ca	Fix CI release glitches. The subprocess change moved the build directory; arm64 builds weren't setting cross-compilation flags when building on x86.	2024-04-03 16:41:40 -07:00
Jeffrey Morgan	cd135317d2	Fix macOS builds on older SDKs (#3467 )	2024-04-03 10:45:54 -07:00
Daniel Hiltgen	58d95cc9bd	Switch back to subprocessing for llama.cpp This should resolve a number of memory leak and stability defects by allowing us to isolate llama.cpp in a separate process and shutdown when idle, and gracefully restart if it has problems. This also serves as a first step to be able to run multiple copies to support multiple models concurrently.	2024-04-01 16:48:18 -07:00
Jeffrey Morgan	856b8ec131	remove need for `$VSINSTALLDIR` since build will fail if `ninja` cannot be found (#3350 )	2024-03-26 16:23:16 -04:00
Jeremy	dfc6721b20	add support for libcudart.so for CUDA devices (adds Jetson support)	2024-03-25 11:07:44 -04:00
Daniel Hiltgen	ab3456207b	Merge pull request #3028 from ollama/ci_release CI release process	2024-03-15 16:40:54 -07:00
Daniel Hiltgen	6ad414f31e	Merge pull request #3086 from dhiltgen/import_server Import server.cpp to retain llava support	2024-03-15 16:10:35 -07:00
Daniel Hiltgen	d4c10df2b0	Add Radeon gfx940-942 GPU support	2024-03-15 15:34:58 -07:00
Daniel Hiltgen	540f4af45f	Wire up more complete CI for releases Flesh out our github actions CI so we can build official releases.	2024-03-15 12:37:36 -07:00
Daniel Hiltgen	85129d3a32	Adapt our build for imported server.cpp	2024-03-12 14:57:15 -07:00
Jeffrey Morgan	369eda65f5	update llama.cpp submodule to `ceca1ae` (#3064 )	2024-03-11 12:57:48 -07:00
Daniel Hiltgen	bc13da2bfe	Avoid rocm runner and dependency clash Putting the rocm symlink next to the runners is risky. This moves the payloads into a subdir to avoid potential clashes.	2024-03-11 09:33:22 -07:00
Daniel Hiltgen	3dc1bb6a35	Harden for deps file being empty (or short)	2024-03-10 14:45:38 -07:00
Jeffrey Morgan	e11668aa07	add `bundle_metal` and `cleanup_metal` functions to `gen_darwin.sh`	2024-03-09 16:04:57 -08:00
Jeffrey Morgan	1ffb1e2874	update llama.cpp submodule to `77d1ac7` (#3030 )	2024-03-09 15:55:34 -08:00
Daniel Hiltgen	6c5ccb11f9	Revamp ROCm support This refines where we extract the LLM libraries to by adding a new OLLAMA_HOME env var, that defaults to `~/.ollama` The logic was already idempotent, so this should speed up startups after the first time a new release is deployed. It also cleans up after itself. We now build only a single ROCm version (latest major) on both windows and linux. Given the large size of ROCm's tensor files, we split the dependency out. It's bundled into the installer on windows, and a separate download on windows. The linux install script is now smart and detects the presence of AMD GPUs and looks to see if rocm v6 is already present, and if not, then downloads our dependency tar file. For Linux discovery, we now use sysfs and check each GPU against what ROCm supports so we can degrade to CPU gracefully instead of having llama.cpp+rocm assert/crash on us. For Windows, we now use go's windows dynamic library loading logic to access the amdhip64.dll APIs to query the GPU information.	2024-03-07 10:36:50 -08:00
John	23ebe8fe11	fix some typos (#2973 ) Signed-off-by: hishope <csqiye@126.com>	2024-03-06 22:50:11 -08:00
Bernhard M. Wiedemann	76e5d9ec88	Omit build date from gzip headers See https://reproducible-builds.org/ for why this is good. This patch was done while working on reproducible builds for openSUSE.	2024-02-29 16:48:19 +01:00
Jeffrey Morgan	efe040f8c0	reset with `init_vars` ahead of each cpu build in `gen_windows.ps1` (#2654 )	2024-02-21 16:35:34 -05:00
Daniel Hiltgen	4fcbf1cde6	Merge pull request #2599 from dhiltgen/fix_avx Explicitly disable AVX2 on GPU builds	2024-02-19 13:13:05 -08:00
Daniel Hiltgen	df6dc4fd96	Fix duplicate menus on update and exit on signals Also fixes a few fit-and-finish items for better developer experience	2024-02-16 15:33:16 -08:00
Daniel Hiltgen	db2a9ad1fe	Explicitly disable AVX2 on GPU builds Even though we weren't setting it to on, somewhere in the cmake config it was getting toggled on. By explicitly setting it to off, we get `/arch:AVX` as intended.	2024-02-15 14:50:11 -08:00
Daniel Hiltgen	29e90cc13b	Implement new Go based Desktop app This focuses on Windows first, but could be used for Mac and possibly linux in the future.	2024-02-15 05:56:45 +00:00
Daniel Hiltgen	6d84f07505	Detect AMD GPU info via sysfs and block old cards This wires up some new logic to start using sysfs to discover AMD GPU information and detects old cards we can't yet support so we can fallback to CPU mode.	2024-02-12 08:19:41 -08:00
Daniel Hiltgen	27aa2d4a19	Merge pull request #1849 from mraiser/main Accommodate split cuda lib dir	2024-02-05 16:01:16 -08:00
Daniel Hiltgen	e1f50377f4	Harden generate patching model Only apply patches if we have any, and make sure to cleanup every file we patched at the end to leave the tree clean	2024-02-01 19:34:36 -08:00
mraiser	4c4c730a0a	Merge branch 'ollama:main' into main	2024-01-27 21:56:11 -05:00
Daniel Hiltgen	e02ecfb6c8	Merge pull request #2116 from dhiltgen/cc_50_80 Add support for CUDA 5.0 cards	2024-01-27 10:28:38 -08:00
Jeffrey Morgan	a64570dcae	Fix clearing kv cache between requests with the same prompt (#2186 ) * Fix clearing kv cache between requests with the same prompt * fix powershell script	2024-01-25 13:46:20 -08:00
mraiser	a4564232a4	Update gen_linux.sh to find libcudart in separate directory	2024-01-25 09:49:35 -05:00
Daniel Hiltgen	0f5b843319	Refine Accelerate usage on mac For old macs, accelerate seems to cause crashes, but for AVX2 capable macs, it does not.	2024-01-22 16:25:56 -08:00
Daniel Hiltgen	df54c723ae	Make CPU builds parallel and customizable AMD GPUs The linux build now supports parallel CPU builds to speed things up. This also exposes AMD GPU targets as an optional setting for advanced users who want to alter our default set.	2024-01-21 15:12:21 -08:00
Daniel Hiltgen	a447a083f2	Add compute capability 5.0, 7.5, and 8.0	2024-01-20 14:24:05 -08:00
Daniel Hiltgen	681a914990	Add support for CUDA 5.2 cards	2024-01-20 10:48:43 -08:00
Jeffrey Morgan	4c54f0ddeb	sign dylibs on macOS (#2101 )	2024-01-19 19:24:11 -05:00
Jeffrey Morgan	dc88cc3981	use `gzip` for runner embedding (#2067 )	2024-01-19 13:23:03 -05:00
Daniel Hiltgen	fccdf4c635	Merge pull request #1987 from xyproto/archlinux Let gpu.go and gen_linux.sh also find CUDA on Arch Linux	2024-01-18 13:32:10 -08:00
Daniel Hiltgen	1b249748ab	Add multiple CPU variants for Intel Mac This also refines the build process for the ext_server build.	2024-01-17 15:08:54 -08:00
Alexander F. Rødseth	cbe2adc78a	Merge branch 'main' into archlinux	2024-01-17 12:50:11 +01:00
Daniel Hiltgen	795674dd90	Bump llama.cpp to b1842 and add new cuda lib dep Upstream llama.cpp has added a new dependency with the NVIDIA CUDA Driver Libraries (libcuda.so) which is part of the driver distribution, not the general cuda libraries, and is not available as an archive, so we can not statically link it. This may introduce some additional compatibility challenges which we'll need to keep an eye on.	2024-01-16 12:53:52 -08:00
Daniel Hiltgen	8795447dad	Merge pull request #1966 from fpreiss/fpreiss/gen_linux_cuda_detection improve cuda detection (rel. issue #1704)	2024-01-14 18:00:11 -08:00

65 commits