ollama

Author	SHA1	Message	Date
Daniel Hiltgen	cd5c8f6471	Optimize container images for startup (#6547) * Optimize container images for startup This change adjusts how to handle runner payloads to support container builds where we keep them extracted in the filesystem. This makes it easier to optimize the cpu/cuda vs cpu/rocm images for size, and should result in faster startup times for container images. * Refactor payload logic and add buildx support for faster builds * Move payloads around * Review comments * Converge to buildx based helper scripts * Use docker buildx action for release	2024-09-12 12:10:30 -07:00
Daniel Hiltgen	4a8069f9c4	Quiet down Docker's new lint warnings (#6716) * Quiet down Docker's new lint warnings Docker has recently added lint warnings to build. This cleans up those warnings. * Fix go lint regression	2024-09-09 17:22:20 -07:00
R0CKSTAR	9df5f0e8e4	Reduce docker image size (#5847) Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>	2024-09-03 09:25:31 -07:00
Daniel Hiltgen	a017cf2fea	Split rocm back out of bundle (#6432) We're over budget for github's maximum release artifact size with rocm + 2 cuda versions. This splits rocm back out as a discrete artifact, but keeps the layout so it can be extracted into the same location as the main bundle.	2024-08-20 07:26:38 -07:00
Daniel Hiltgen	88bb9e3328	Adjust layout to bin+lib/ollama	2024-08-19 09:38:53 -07:00
Daniel Hiltgen	3b19cdba2a	Remove Jetpack	2024-08-19 09:38:53 -07:00
Daniel Hiltgen	f6c811b320	Enable cuda v12 flags	2024-08-19 09:38:53 -07:00
Daniel Hiltgen	4fe3a556fa	Add cuda v12 variant and selection logic Based on compute capability and driver version, pick v12 or v11 cuda variants.	2024-08-19 09:38:53 -07:00
Daniel Hiltgen	d470ebe78b	Add Jetson cuda variants for arm This adds new variants for arm64 specific to Jetson platforms	2024-08-19 09:38:53 -07:00
Daniel Hiltgen	c7bcb00319	Wire up ccache and pigz in the docker based build This should help speed things up a little	2024-08-19 09:38:53 -07:00
Daniel Hiltgen	74d45f0102	Refactor linux packaging This adjusts linux to follow a similar model to windows with a discrete archive (zip/tgz) to carry the primary executable and dependent libraries. Runners are still carried as payloads inside the main binary. Darwin retains the payload model, where the go binary is fully self-contained.	2024-08-19 09:38:53 -07:00
lreed	f02f83660c	bump go version to 1.22.5 to fix security vulnerabilities	2024-07-17 21:44:19 +00:00
Daniel Hiltgen	224337b32f	Bump linux ROCm to 6.1.2	2024-07-15 15:10:22 -07:00
Daniel Hiltgen	020bd60ab2	Switch amd container image base to rocky 8 The centos 7 arm mirrors have disappeared due to the EOL 2 days ago, and the vault sed workaround which works for x86 doesn't work for arm.	2024-07-02 10:34:47 -07:00
Daniel Hiltgen	26ab67732b	Bump ROCm linux to 6.1.1	2024-06-14 15:37:54 -07:00
Jeremy	8aec92fa6d	rearranged conditional logic for static build, dockerfile updated	2024-04-17 14:43:28 -04:00
Jeremy	70261b9bb6	move static build to its own flag	2024-04-17 13:04:28 -04:00
Daniel Hiltgen	c2d813bdc3	Fix rocm deps with new subprocess paths	2024-04-11 12:52:06 -07:00
Daniel Hiltgen	58d95cc9bd	Switch back to subprocessing for llama.cpp This should resolve a number of memory leak and stability defects by allowing us to isolate llama.cpp in a separate process, shut it down when idle, and gracefully restart it if it has problems. This also serves as a first step to be able to run multiple copies to support multiple models concurrently.	2024-04-01 16:48:18 -07:00
Daniel Hiltgen	c91a4ebcff	Bump ROCm to 6.0.2 patch release	2024-03-28 15:58:57 -07:00
Patrick Devine	1b272d5bcd	change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347)	2024-03-26 13:04:17 -07:00
Daniel Hiltgen	e0319bd78d	Revert "Switch arm cuda base image to centos 7" This reverts commit `5dacc1ebe8`.	2024-03-25 19:01:11 -07:00
Daniel Hiltgen	5dacc1ebe8	Switch arm cuda base image to centos 7 We had started using rocky linux 8, but they've updated to GCC 10.3, which breaks NVCC. 10.2 is compatible (or 10.4, but that's not available from rocky linux 8 repos yet)	2024-03-25 15:57:08 -07:00
Bruce MacDonald	a5ba0fcf78	doc: faq gpu compatibility (#3142)	2024-03-21 05:21:34 -04:00
Daniel Hiltgen	540f4af45f	Wire up more complete CI for releases Flesh out our github actions CI so we can build official releases.	2024-03-15 12:37:36 -07:00
Jeffrey Morgan	b5fcd9d3aa	use `-trimpath` when building releases (#3069)	2024-03-11 15:58:46 -07:00
Daniel Hiltgen	82ca694d68	Rename ROCm deps file to avoid confusion (#3025)	2024-03-09 17:48:38 -08:00
Daniel Hiltgen	6c5ccb11f9	Revamp ROCm support This refines where we extract the LLM libraries to by adding a new OLLAMA_HOME env var that defaults to `~/.ollama`. The logic was already idempotent, so this should speed up startups after the first time a new release is deployed. It also cleans up after itself. We now build only a single ROCm version (latest major) on both windows and linux. Given the large size of ROCm's tensor files, we split the dependency out. It's bundled into the installer on windows, and a separate download on linux. The linux install script is now smart and detects the presence of AMD GPUs and looks to see if rocm v6 is already present, and if not, then downloads our dependency tar file. For Linux discovery, we now use sysfs and check each GPU against what ROCm supports so we can degrade to CPU gracefully instead of having llama.cpp+rocm assert/crash on us. For Windows, we now use go's windows dynamic library loading logic to access the amdhip64.dll APIs to query the GPU information.	2024-03-07 10:36:50 -08:00
Jeffrey Morgan	d481fb3cc8	update go to 1.22 in other places (#2975)	2024-03-07 07:39:49 -08:00
Daniel Hiltgen	794a916a72	Add env var so podman will map cuda GPUs Without this env var, podman's GPU logic doesn't map the GPU through	2024-02-29 08:43:08 -08:00
Daniel Hiltgen	75c44aa319	Add back ROCm container support This adds ROCm support back as a discrete image.	2024-01-26 09:24:29 -08:00
Daniel Hiltgen	a34e1ad3cf	Switch back to ubuntu base The size increase for rocm support in the standard image is problematic. We'll revisit multiple tags for rocm support in a follow-up PR.	2024-01-25 16:46:01 -08:00
Daniel Hiltgen	df54c723ae	Make CPU builds parallel and customizable AMD GPUs The linux build now supports parallel CPU builds to speed things up. This also exposes AMD GPU targets as an optional setting for advanced users who want to alter our default set.	2024-01-21 15:12:21 -08:00
Daniel Hiltgen	da72235ebf	Combine the 2 Dockerfiles and add ROCm This renames Dockerfile.build to Dockerfile, and adds some new stages to support 2 modes of building - the build_linux.sh script uses intermediate stages to extract the artifacts for ./dist, and the default build generates a container image usable by both cuda and rocm cards. This required transitioning the x86 base to the rocm image to avoid layer bloat.	2024-01-21 11:37:11 -08:00
Michael Yang	0409c1fa59	docker: set PATH, LD_LIBRARY_PATH, and capabilities (#1336) * docker: set PATH, LD_LIBRARY_PATH, and capabilities * example: update k8s gpu manifest	2023-11-30 21:16:56 -08:00
Jeffrey Morgan	89ba19feca	use Go `1.21.3` in `Dockerfile`	2023-10-12 23:23:12 -04:00
Jeffrey Morgan	dc87e9c9ae	update `Dockerfile` to pass `GOFLAGS`	2023-10-03 07:05:15 -07:00
Michael Yang	0a4f21c0a7	fix docker build (#659)	2023-09-30 13:34:01 -07:00
Jeffrey Morgan	9abb66254a	docker: fix volume permission errors	2023-09-30 12:32:15 -07:00
Michael Yang	92d454ec5f	update build_darwin.sh	2023-09-29 11:29:23 -07:00
Jeffrey Morgan	2ded8ab206	use `11.8.0` nvidia dockerfile base image for now	2023-09-26 21:48:41 -07:00
Michael Yang	93d3a2568d	replace dockerfile	2023-09-22 11:57:38 -07:00
Michael Yang	9aa192c812	update cuda docker image	2023-09-14 11:25:20 -07:00
Michael Yang	9795b43d93	update dockerfile	2023-09-06 15:31:25 -07:00
Jeffrey Morgan	7c71c10d4f	fix compilation issue in Dockerfile, remove from `README.md` until ready	2023-07-11 19:51:08 -07:00
Jeffrey Morgan	ea809df196	update `Dockerfile` to use OLLAMA_HOST	2023-07-07 23:43:50 -04:00
Jeffrey Morgan	fdb93ef2aa	fix dockerfile	2023-07-06 16:34:44 -04:00
Jeffrey Morgan	6292f4b64c	update `Dockerfile`	2023-07-06 16:34:44 -04:00
Jeffrey Morgan	48920c873b	add basic `Dockerfile`	2023-06-30 12:19:04 -04:00
Jeffrey Morgan	54a94566f1	add `Dockerfile`	2023-06-30 10:47:55 -04:00

50 commits