ollama

Author	SHA1	Message	Date
Daniel Hiltgen	b2799f111b	Move libraries out of user's path We update the PATH on windows to get the CLI mapped, but this has an unintended side effect of causing other apps that may use our bundled DLLs to get terminated when we upgrade.	2024-06-17 13:12:18 -07:00
Jeffrey Morgan	152fc202f5	llm: update llama.cpp commit to `7c26775` (#4896) * llm: update llama.cpp submodule to `7c26775` * disable `LLAMA_BLAS` for now * `-DLLAMA_OPENMP=off`	2024-06-17 15:56:16 -04:00
Daniel Hiltgen	0577af98f4	More parallelism on windows generate Make the build faster	2024-06-15 07:44:55 -07:00
Daniel Hiltgen	ab8c929e20	Add ability to skip oneapi generate This follows the same pattern for cuda and rocm to allow disabling the build even when we detect the dependent libraries	2024-06-07 08:32:49 -07:00
Jeffrey Morgan	7ca9605f54	speed up tests by only building static lib (#4740)	2024-05-30 21:43:15 -07:00
Daniel Hiltgen	646371f56d	Merge pull request #3278 from zhewang1-intc/rebase_ollama_main Enabling ollama to run on Intel GPUs with SYCL backend	2024-05-28 16:30:50 -07:00
Wang,Zhe	fd5971be0b	support ollama run on Intel GPUs	2024-05-24 11:18:27 +08:00
Daniel Hiltgen	c48c1d7c46	Port cuda/rocm skip build vars to linux Windows already implements these, carry over to linux.	2024-05-15 15:56:43 -07:00
Hernan Martinez	8a65717f55	Do not build AVX runners on ARM64	2024-04-26 23:55:32 -06:00
Hernan Martinez	b438d485f1	Use architecture specific folders in the generate script	2024-04-26 23:34:12 -06:00
Daniel Hiltgen	e4859c4563	Fine grain control over windows generate steps This will speed up CI which already tries to only build static for unit tests	2024-04-26 15:49:46 -07:00
Daniel Hiltgen	ed5fb088c4	Fix target in gen_windows.ps1	2024-04-26 15:10:42 -07:00
Daniel Hiltgen	421c878a2d	Put back non-avx CPU build for windows	2024-04-26 12:44:07 -07:00
Daniel Hiltgen	8671fdeda6	Refactor windows generate for more modular usage	2024-04-26 08:35:50 -07:00
Daniel Hiltgen	8feb97dc0d	Move cuda/rocm dependency gathering into generate script This will make it simpler for CI to accumulate artifacts from prior steps	2024-04-25 22:38:44 -07:00
Roy Yang	5f73c08729	Remove trailing spaces (#3889)	2024-04-25 14:32:26 -04:00
Daniel Hiltgen	058f6cd2cc	Move nested payloads to installer and zip file on windows Now that the llm runner is an executable and not just a dll, more users are facing problems with security policy configurations on windows that prevent users writing to directories and then executing binaries from the same location. This change removes payloads from the main executable on windows and shifts them over to be packaged in the installer and discovered based on the executables location. This also adds a new zip file for people who want to "roll their own" installation model.	2024-04-23 16:14:47 -07:00
Daniel Hiltgen	cc5a71e0e3	Merge pull request #3709 from remy415/custom-gpu-defs Adds support for customizing GPU build flags in llama.cpp	2024-04-23 09:28:34 -07:00
Jeremy	9c0db4cc83	Update gen_windows.ps1 Fixed improper env references	2024-04-21 16:13:41 -04:00
Jeremy	6f18297b3a	Update gen_windows.ps1 Forgot a " on the write-host	2024-04-18 19:47:44 -04:00
Jeremy	15016413de	Update gen_windows.ps1 Added OLLAMA_CUSTOM_CUDA_DEFS and OLLAMA_CUSTOM_ROCM_DEFS to customize GPU builds on Windows	2024-04-18 19:27:16 -04:00
Jeremy	440b7190ed	Update gen_linux.sh Added OLLAMA_CUSTOM_CUDA_DEFS and OLLAMA_CUSTOM_ROCM_DEFS instead of OLLAMA_CUSTOM_GPU_DEFS	2024-04-18 19:18:10 -04:00
Jeremy	52f5370c48	add support for custom gpu build flags for llama.cpp	2024-04-17 16:00:48 -04:00
Jeremy	7c000ec3ed	adds support for OLLAMA_CUSTOM_GPU_DEFS to customize GPU build flags	2024-04-17 15:21:05 -04:00
Jeremy	8aec92fa6d	rearranged conditional logic for static build, dockerfile updated	2024-04-17 14:43:28 -04:00
Jeremy	70261b9bb6	move static build to its own flag	2024-04-17 13:04:28 -04:00
Blake Mizerany	1524f323a3	Revert "build.go: introduce a friendlier way to build Ollama (#3548)" (#3564)	2024-04-09 15:57:45 -07:00
Blake Mizerany	fccf3eecaa	build.go: introduce a friendlier way to build Ollama (#3548) This commit introduces a more friendly way to build Ollama dependencies and the binary without abusing `go generate` and removing the unnecessary extra steps it brings with it. This script also provides nicer feedback to the user about what is happening during the build process. At the end, it prints a helpful message to the user about what to do next (e.g. run the new local Ollama).	2024-04-09 14:18:47 -07:00
Jeffrey Morgan	63efa075a0	update generate scripts with new `LLAMA_CUDA` variable, set `HIP_PLATFORM` to avoid compiler errors (#3528)	2024-04-07 19:29:51 -04:00
Daniel Hiltgen	dfe330fa1c	Merge pull request #3488 from mofanke/fix-windows-dll-compress fix dll compress in windows building	2024-04-04 16:12:13 -07:00
Daniel Hiltgen	36bd967722	Fail fast if mingw missing on windows	2024-04-04 09:51:26 -07:00
mofanke	4de0126719	fix dll compress in windows building	2024-04-04 21:27:33 +08:00
Daniel Hiltgen	e4a7e5b2ca	Fix CI release glitches The subprocess change moved the build directory. arm64 builds weren't setting cross-compilation flags when building on x86.	2024-04-03 16:41:40 -07:00
Jeffrey Morgan	cd135317d2	Fix macOS builds on older SDKs (#3467)	2024-04-03 10:45:54 -07:00
Daniel Hiltgen	58d95cc9bd	Switch back to subprocessing for llama.cpp This should resolve a number of memory leak and stability defects by allowing us to isolate llama.cpp in a separate process and shutdown when idle, and gracefully restart if it has problems. This also serves as a first step to be able to run multiple copies to support multiple models concurrently.	2024-04-01 16:48:18 -07:00
Jeffrey Morgan	856b8ec131	remove need for `$VSINSTALLDIR` since build will fail if `ninja` cannot be found (#3350)	2024-03-26 16:23:16 -04:00
Jeremy	dfc6721b20	add support for libcudart.so for CUDA devices (adds Jetson support)	2024-03-25 11:07:44 -04:00
Daniel Hiltgen	ab3456207b	Merge pull request #3028 from ollama/ci_release CI release process	2024-03-15 16:40:54 -07:00
Daniel Hiltgen	6ad414f31e	Merge pull request #3086 from dhiltgen/import_server Import server.cpp to retain llava support	2024-03-15 16:10:35 -07:00
Daniel Hiltgen	d4c10df2b0	Add Radeon gfx940-942 GPU support	2024-03-15 15:34:58 -07:00
Daniel Hiltgen	540f4af45f	Wire up more complete CI for releases Flesh out our github actions CI so we can build official releases.	2024-03-15 12:37:36 -07:00
Daniel Hiltgen	85129d3a32	Adapt our build for imported server.cpp	2024-03-12 14:57:15 -07:00
Jeffrey Morgan	369eda65f5	update llama.cpp submodule to `ceca1ae` (#3064)	2024-03-11 12:57:48 -07:00
Daniel Hiltgen	bc13da2bfe	Avoid rocm runner and dependency clash Putting the rocm symlink next to the runners is risky. This moves the payloads into a subdir to avoid potential clashes.	2024-03-11 09:33:22 -07:00
Daniel Hiltgen	3dc1bb6a35	Harden for deps file being empty (or short)	2024-03-10 14:45:38 -07:00
Jeffrey Morgan	e11668aa07	add `bundle_metal` and `cleanup_metal` functions to `gen_darwin.sh`	2024-03-09 16:04:57 -08:00
Jeffrey Morgan	1ffb1e2874	update llama.cpp submodule to `77d1ac7` (#3030)	2024-03-09 15:55:34 -08:00
Daniel Hiltgen	6c5ccb11f9	Revamp ROCm support This refines where we extract the LLM libraries to by adding a new OLLAMA_HOME env var, that defaults to `~/.ollama`. The logic was already idempotent, so this should speed up startups after the first time a new release is deployed. It also cleans up after itself. We now build only a single ROCm version (latest major) on both windows and linux. Given the large size of ROCm's tensor files, we split the dependency out. It's bundled into the installer on windows, and a separate download on windows. The linux install script is now smart and detects the presence of AMD GPUs and looks to see if rocm v6 is already present, and if not, then downloads our dependency tar file. For Linux discovery, we now use sysfs and check each GPU against what ROCm supports so we can degrade to CPU gracefully instead of having llama.cpp+rocm assert/crash on us. For Windows, we now use go's windows dynamic library loading logic to access the amdhip64.dll APIs to query the GPU information.	2024-03-07 10:36:50 -08:00
John	23ebe8fe11	fix some typos (#2973) Signed-off-by: hishope <csqiye@126.com>	2024-03-06 22:50:11 -08:00
Bernhard M. Wiedemann	76e5d9ec88	Omit build date from gzip headers See https://reproducible-builds.org/ for why this is good. This patch was done while working on reproducible builds for openSUSE.	2024-02-29 16:48:19 +01:00

88 commits