ollama

Author	SHA1	Message	Date
Daniel Hiltgen	0fdebb34a9	Doc how to set up ROCm builds on windows	2024-03-09 11:29:45 -08:00
Daniel Hiltgen	6c5ccb11f9	Revamp ROCm support: This refines where we extract the LLM libraries to by adding a new OLLAMA_HOME env var that defaults to `~/.ollama`. The logic was already idempotent, so this should speed up startups after the first time a new release is deployed. It also cleans up after itself. We now build only a single ROCm version (latest major) on both Windows and Linux. Given the large size of ROCm's tensor files, we split the dependency out. It's bundled into the installer on Windows, and a separate download on Linux. The Linux install script now detects the presence of AMD GPUs and checks whether ROCm v6 is already present; if not, it downloads our dependency tar file. For Linux discovery, we now use sysfs and check each GPU against what ROCm supports so we can degrade to CPU gracefully instead of having llama.cpp+rocm assert/crash on us. For Windows, we now use Go's windows dynamic library loading logic to access the amdhip64.dll APIs to query the GPU information.	2024-03-07 10:36:50 -08:00
Jeffrey Morgan	d481fb3cc8	update go to 1.22 in other places (#2975)	2024-03-07 07:39:49 -08:00
John	23ebe8fe11	fix some typos (#2973) Signed-off-by: hishope <csqiye@126.com>	2024-03-06 22:50:11 -08:00
Daniel Hiltgen	e02ecfb6c8	Merge pull request #2116 from dhiltgen/cc_50_80: Add support for CUDA 5.0 cards	2024-01-27 10:28:38 -08:00
Daniel Hiltgen	df54c723ae	Make CPU builds parallel and customizable AMD GPUs: The Linux build now supports parallel CPU builds to speed things up. This also exposes AMD GPU targets as an optional setting for advanced users who want to alter our default set.	2024-01-21 15:12:21 -08:00
Daniel Hiltgen	a447a083f2	Add compute capability 5.0, 7.5, and 8.0	2024-01-20 14:24:05 -08:00
Daniel Hiltgen	abec7f06e5	Merge pull request #2056 from dhiltgen/slog: Mechanical switch from log to slog	2024-01-18 14:27:24 -08:00
Daniel Hiltgen	ecbfc0182f	Go bump to v1.21 to pick up slog	2024-01-18 14:12:57 -08:00
Daniel Hiltgen	fedd705aea	Mechanical switch from log to slog: A few obvious levels were adjusted, but generally everything mapped to "info" level.	2024-01-18 14:12:57 -08:00
Daniel Hiltgen	9cd20b0ec8	Refine the linux cuda/rocm developer docs	2024-01-18 09:44:44 -08:00
Daniel Hiltgen	d88c527be3	Build multiple CPU variants and pick the best: This reduces the built-in Linux version to not use any vector extensions, which enables the resulting builds to run under Rosetta on macOS in Docker. Then at runtime it checks for the actual CPU vector extensions and loads the best CPU library available.	2024-01-11 08:42:47 -08:00
Daniel Hiltgen	e201efa14b	Add windows native build instructions	2023-12-25 08:31:34 -08:00
Daniel Hiltgen	e5202eb687	Quiet down llama.cpp logging by default: By default builds will now produce non-debug and non-verbose binaries. To enable verbose logs in llama.cpp and debug symbols in the native code, set `CGO_CFLAGS=-g`	2023-12-22 08:47:18 -08:00
Daniel Hiltgen	1b991d0ba9	Refine build to support CPU only: If someone checks out the ollama repo and doesn't install the CUDA library, this will ensure they can build a CPU-only version	2023-12-19 09:05:46 -08:00
Jiayu Liu	4fc10acce9	add some missing code directives in docs (#664)	2023-10-01 11:51:01 -07:00
Michael Yang	6c6a31a1e8	embed libraries using cmake	2023-09-20 14:41:57 -07:00
Bruce MacDonald	fc6ec356fc	remove libcuda.so	2023-09-20 20:36:14 +01:00
Bruce MacDonald	1255bc9b45	only package 11.8 runner	2023-09-20 20:00:41 +01:00
Bruce MacDonald	4e8be787c7	pack in cuda libs	2023-09-20 17:40:42 +01:00
Bruce MacDonald	2540c9181c	support for packaging in multiple cuda runners (#509): enable packaging multiple cuda versions; use nvcc cuda version if available. Co-authored-by: Michael Yang <mxyng@pm.me>	2023-09-14 15:08:13 -04:00
Bruce MacDonald	f221637053	first pass at linux gpu support (#454): linux gpu support; handle multiple gpus; add cuda docker image (#488). Co-authored-by: Michael Yang <mxyng@pm.me>	2023-09-12 11:04:35 -04:00
Bruce MacDonald	42998d797d	subprocess llama.cpp server (#401): remove c code; pack llama.cpp; use request context for llama_cpp; let llama_cpp decide the number of threads to use; stop llama runner when app stops; remove sample count and duration metrics; use go generate to get libraries; tmp dir for running llm	2023-08-30 16:35:03 -04:00
Michael Yang	041f9ad1a1	update README.md	2023-08-25 11:44:25 -07:00
Jeffrey Morgan	1f78e409b4	docs: format with `prettier`	2023-08-08 15:41:48 -07:00
Michael Yang	24e43e3212	update development.md	2023-07-24 09:43:57 -07:00
Bruce MacDonald	52f04e39f2	Note that CGO must be enabled in dev docs	2023-07-21 22:36:36 +02:00
Matt Williams	3d9498dc95	Some simple modelfile examples Signed-off-by: Matt Williams <m@technovangelist.com>	2023-07-17 17:16:59 -07:00
Jeffrey Morgan	1358e27b77	add publish script	2023-07-07 12:59:45 -04:00
Michael Yang	9811956938	update development.md	2023-06-28 12:41:30 -07:00
Jeffrey Morgan	9ba58c8a9e	move desktop docs to `desktop/`	2023-06-28 11:29:29 -04:00
Jeffrey Morgan	9f868d8258	move desktop docs to `desktop/`	2023-06-28 11:27:18 -04:00
Bruce MacDonald	4018b3c533	poetry development	2023-06-28 11:17:08 -04:00
Bruce MacDonald	ecfb4abafb	simplify loading	2023-06-27 14:50:30 -04:00
Michael Chiang	2906cbab11	Update development.md	2023-06-27 14:07:31 -04:00
Michael Chiang	9d14e75185	Update development.md	2023-06-27 14:06:59 -04:00
Michael Chiang	a2745f8174	Update development.md	2023-06-27 14:06:49 -04:00
Jeffrey Morgan	20cdd9fee6	update `README.md`	2023-06-27 13:51:20 -04:00
Bruce MacDonald	11614b6d84	add development doc	2023-06-27 13:46:46 -04:00

39 commits