ollama

Author	SHA1	Message	Date
Daniel Hiltgen	b123be5b71	Adjust context size for parallelism	2024-04-25 13:58:54 -07:00
jmorganca	ddf5c09a9b	use matrix multiplication kernels in more cases	2024-04-25 13:58:54 -07:00
Roy Yang	5f73c08729	Remove trailing spaces (#3889)	2024-04-25 14:32:26 -04:00
Daniel Hiltgen	f503a848c2	Merge pull request #3895 from brycereitano/shiftloading Move ggml loading to when attempting to fit	2024-04-25 09:24:08 -07:00
Bryce Reitano	36a6daccab	Restructure loading conditional chain	2024-04-24 17:37:03 -06:00
Bryce Reitano	ceb0e26e5e	Provide variable ggml for TestLoad	2024-04-24 17:19:55 -06:00
Bryce Reitano	284e02bed0	Move ggml loading to when we attempt fitting	2024-04-24 17:17:24 -06:00
Michael Yang	3450a57d4a	Merge pull request #3713 from ollama/mxyng/modelname update copy handler to use model.Name	2024-04-24 16:00:32 -07:00
Michael Yang	592dae31c8	update copy to use model.Name	2024-04-24 15:54:54 -07:00
Michael Yang	2010cbc5fa	Merge pull request #3833 from ollama/mxyng/fix-from fix: from blob	2024-04-24 15:13:47 -07:00
Michael Yang	ac0801eced	only replace if it matches command	2024-04-24 14:49:26 -07:00
Michael Yang	ad66e5b060	split temp zip files	2024-04-24 14:18:01 -07:00
Blake Mizerany	ade4b55520	types/model: make ParseName use default without question (#3886)	2024-04-24 11:52:55 -07:00
Daniel Hiltgen	a6d62e0617	Merge pull request #3882 from dhiltgen/amd_gfx AMD gfx patch rev is hex	2024-04-24 11:07:49 -07:00
Daniel Hiltgen	6e76348df7	Merge pull request #3834 from dhiltgen/not_found_in_path Report errors on server lookup instead of path lookup failure	2024-04-24 10:50:48 -07:00
Daniel Hiltgen	0d6687f84c	AMD gfx patch rev is hex Correctly handle gfx90a discovery	2024-04-24 09:43:52 -07:00
Patrick Devine	74d2a9ef9a	add OLLAMA_KEEP_ALIVE env variable to FAQ (#3865)	2024-04-23 21:06:51 -07:00
Patrick Devine	14476d48cc	fixes for gguf (#3863)	2024-04-23 20:57:20 -07:00
Patrick Devine	ce8ce82567	add mixtral 8x7b model conversion (#3859)	2024-04-23 20:17:04 -07:00
Blake Mizerany	4dc4f1be34	types/model: restrict digest hash part to a minimum of 2 characters (#3858) This allows users of a valid Digest to know it has a minimum of 2 characters in the hash part for use when sharding. This is a reasonable restriction as the hash part is a SHA256 hash which is 64 characters long, which is the common hash used. There is no anticipation of using a hash with less than 2 characters. Also, add MustParseDigest. Also, replace Digest.Type with Digest.Split for getting both the type and hash parts together, which is the most common case when asking for either.	2024-04-23 18:24:17 -07:00
Daniel Hiltgen	16b52331a4	Merge pull request #3857 from dhiltgen/mem_escape_valve Add back memory escape valve	2024-04-23 17:32:24 -07:00
Daniel Hiltgen	5445aaa94e	Add back memory escape valve If we get our predictions wrong, this can be used to set a lower memory limit as a workaround. Recent multi-gpu refactoring accidentally removed it, so this adds it back.	2024-04-23 17:09:02 -07:00
Daniel Hiltgen	2ac3dd6853	Merge pull request #3850 from dhiltgen/windows_packaging Move nested payloads to installer and zip file on windows	2024-04-23 16:35:20 -07:00
Daniel Hiltgen	d8851cb7a0	Harden sched TestLoad Give the go routine a moment to deliver the expired event	2024-04-23 16:14:47 -07:00
Daniel Hiltgen	058f6cd2cc	Move nested payloads to installer and zip file on windows Now that the llm runner is an executable and not just a dll, more users are facing problems with security policy configurations on windows that prevent users writing to directories and then executing binaries from the same location. This change removes payloads from the main executable on windows and shifts them over to be packaged in the installer and discovered based on the executable's location. This also adds a new zip file for people who want to "roll their own" installation model.	2024-04-23 16:14:47 -07:00
Daniel Hiltgen	790cf34d17	Merge pull request #3846 from dhiltgen/missing_runner Detect and recover if runner removed	2024-04-23 13:14:12 -07:00
Michael	928d844896	adding phi-3 mini to readme	2024-04-23 13:58:31 -04:00
Daniel Hiltgen	939d6a8606	Make CI lint verbose	2024-04-23 10:17:42 -07:00
Daniel Hiltgen	58888a74bc	Detect and recover if runner removed Tmp cleaners can nuke the file out from underneath us. This detects the missing runner, and re-initializes the payloads.	2024-04-23 10:05:26 -07:00
Daniel Hiltgen	cc5a71e0e3	Merge pull request #3709 from remy415/custom-gpu-defs Adds support for customizing GPU build flags in llama.cpp	2024-04-23 09:28:34 -07:00
Michael Yang	e83bcf7f9a	Merge pull request #3836 from ollama/mxyng/mixtral fix: mixtral graph	2024-04-23 09:15:10 -07:00
Daniel Hiltgen	5690e5ce99	Merge pull request #3418 from dhiltgen/concurrency Request and model concurrency	2024-04-23 08:31:38 -07:00
Daniel Hiltgen	f2ea8470e5	Local unicode test case	2024-04-22 19:29:12 -07:00
Daniel Hiltgen	34b9db5afc	Request and model concurrency This change adds support for multiple concurrent requests, as well as loading multiple models by spawning multiple runners. The default settings are currently set at 1 concurrent request per model and only 1 loaded model at a time, but these can be adjusted by setting OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.	2024-04-22 19:29:12 -07:00
Daniel Hiltgen	8711d03df7	Report errors on server lookup instead of path lookup failure	2024-04-22 19:08:47 -07:00
Daniel Hiltgen	ee448deaba	Merge pull request #3835 from dhiltgen/harden_llm_override Trim spaces and quotes from llm lib override	2024-04-22 19:06:54 -07:00
Bruce MacDonald	6e8db04716	tidy community integrations - move some popular integrations to the top of the lists	2024-04-22 17:29:08 -07:00
Bruce MacDonald	658e60cf73	Revert "stop running model on interactive exit" This reverts commit `fad00a85e5`.	2024-04-22 17:23:11 -07:00
Bruce MacDonald	4c78f028f8	Merge branch 'main' of https://github.com/ollama/ollama	2024-04-22 17:22:28 -07:00
Michael Yang	435cc866a3	fix: mixtral graph	2024-04-22 17:19:44 -07:00
Hao Wu	c7d3a558f6	docs: update README to add chat (web UI) for LLM (#3810) * add chat (web UI) for LLM I have used chat with llama3 in local successfully and the code is MIT licensed. * Update README.md Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2024-04-22 20:19:39 -04:00
Maple Gao	089cdb2877	docs: Update README for Lobe-chat integration. (#3817) Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2024-04-22 20:18:15 -04:00
Võ Đình Đạt	ea1e9aa36b	Update README.md (#3655)	2024-04-22 20:16:55 -04:00
Jonathan Smoley	d0d28ef90d	Update README.md with Discord-Ollama project (#3633) Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2024-04-22 20:14:20 -04:00
Eric Curtin	6654186a7c	Add podman-ollama to terminal apps (#3626) The goal of podman-ollama is to make AI even more boring. Signed-off-by: Eric Curtin <ecurtin@redhat.com>	2024-04-22 20:13:23 -04:00
Daniel Hiltgen	aa72281eae	Trim spaces and quotes from llm lib override	2024-04-22 17:11:14 -07:00
reid41	74bcbf828f	add qa-pilot link (#3612) * add qa-pilot link * format the link * add shell-pilot	2024-04-22 20:10:34 -04:00
Christian Neff	fe39147e64	Add Chatbot UI v2 to Community Integrations (#3503)	2024-04-22 20:09:55 -04:00
Bruce MacDonald	fad00a85e5	stop running model on interactive exit	2024-04-22 16:22:14 -07:00
Jeremy	9c0db4cc83	Update gen_windows.ps1 Fixed improper env references	2024-04-21 16:13:41 -04:00

3369 commits