ollama

Author	SHA1	Message	Date
Michael Yang	01114b4526	fix: rope	2024-04-09 16:15:24 -07:00
Michael Yang	9502e5661f	cgo quantize	2024-04-08 15:31:08 -07:00
Michael Yang	e1c9a2a00f	no blob create if already exists	2024-04-08 15:09:48 -07:00
Michael Yang	be517e491c	no rope parameters	2024-04-05 18:05:27 -07:00
Patrick Devine	1b272d5bcd	change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347)	2024-03-26 13:04:17 -07:00
Patrick Devine	47cfe58af5	Default Keep Alive environment variable (#3094) --------- Co-authored-by: Chris-AS1 <8493773+Chris-AS1@users.noreply.github.com>	2024-03-13 13:29:40 -07:00
Jeffrey Morgan	3b4bab3dc5	Fix embeddings load model behavior (#2848)	2024-02-29 17:40:56 -08:00
Ikko Eltociear Ashimine	e95b896790	Update types.go (#2744) specfied -> specified	2024-02-25 13:41:25 -05:00
Michael Yang	897b213468	use http.DefaultClient (#2530) default client already handles proxy	2024-02-20 18:34:47 -05:00
bnorick	caf2b13c10	Fix infinite keep_alive (#2480)	2024-02-13 15:40:32 -08:00
Patrick Devine	b5cf31b460	add keep_alive to generate/chat/embedding api endpoints (#2146)	2024-01-26 14:28:02 -08:00
Patrick Devine	7c40a67841	Save and load sessions (#2063)	2024-01-25 12:12:36 -08:00
Michael Yang	745b5934fa	add model to ModelResponse	2024-01-18 14:32:55 -08:00
Michael Yang	a38d88d828	api: add model for all requests prefer using req.Model and fallback to req.Name	2024-01-18 14:31:37 -08:00
Michael Yang	5ffbbea1d7	remove client.py	2024-01-11 15:53:10 -08:00
Patrick Devine	22e93efa41	add show info command and fix the modelfile	2024-01-05 12:20:05 -08:00
Brian Murray	0d6e3565ae	Add embeddings to API (#1773)	2024-01-04 15:00:52 -05:00
Jeffrey Morgan	55978c1dc9	clean up cache api option	2023-12-27 14:27:45 -05:00
Jeffrey Morgan	d4ebdadbe7	enable `cache_prompt` by default	2023-12-27 14:23:42 -05:00
K0IN	10da41d677	Add Cache flag to api (#1642)	2023-12-22 17:16:20 -05:00
Bruce MacDonald	d99fa6ce0a	send empty messages on last chat response (#1530)	2023-12-18 14:23:38 -05:00
Patrick Devine	d9e60f634b	add image support to the chat api (#1490)	2023-12-12 13:28:58 -08:00
Patrick Devine	910e9401d0	Multimodal support (#1216) --------- Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>	2023-12-11 13:56:22 -08:00
Jeffrey Morgan	9e1406e4ed	Don't expose model information in `/api/generate`	2023-12-09 02:05:43 -08:00
Michael Yang	c3ff36088b	Merge pull request #774 from jmorganca/mxyng/server-version add version api and show server version in cli	2023-12-06 13:22:55 -08:00
Michael Yang	5d75505ebd	return model configuration in generate	2023-12-05 14:39:02 -08:00
Bruce MacDonald	195e3d9dbd	chat api endpoint (#1392)	2023-12-05 14:57:33 -05:00
Michael Yang	0db4706ec2	api: add version api handler	2023-12-05 09:36:01 -08:00
Jeffrey Morgan	00d06619a1	Revert "chat api (#991)" while context variable is fixed This reverts commit `7a0899d62d`.	2023-12-04 21:16:27 -08:00
Bruce MacDonald	7a0899d62d	chat api (#991) - update chat docs - add messages chat endpoint - remove deprecated context and template generate parameters from docs - context and template are still supported for the time being and will continue to work as expected - add partial response to chat history	2023-12-04 18:01:06 -05:00
Patrick Devine	cde31cb220	Allow setting parameters in the REPL (#1294)	2023-11-29 09:56:42 -08:00
Bruce MacDonald	928950fcc6	update python client create example (#1227) * add remote create to python example client	2023-11-27 15:36:19 -05:00
Michael Yang	bc22d5a38b	no blob response	2023-11-15 15:16:23 -08:00
Michael Yang	1901044b07	use checksum reference	2023-11-15 15:16:23 -08:00
Michael Yang	1552cee59f	client create modelfile	2023-11-15 15:16:23 -08:00
Michael Yang	3ca56b5ada	add create modelfile field	2023-11-15 15:16:23 -08:00
Jeffrey Morgan	cdddd3df65	add `format` to example python client	2023-11-10 10:22:21 -08:00
Jeffrey Morgan	5cba29b9d6	JSON mode: add `"format" as an api parameter (#1051) * add `"format": "json"` as an API parameter --------- Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2023-11-09 16:44:02 -08:00
Bruce MacDonald	a49d6acc1e	add a complete /generate options example (#1035)	2023-11-08 16:44:36 -08:00
Bruce MacDonald	ec2a31e9b3	support raw generation requests (#952) - add the optional `raw` generate request parameter to bypass prompt formatting and response context - add raw request to docs	2023-11-08 14:05:02 -08:00
Jeffrey Morgan	17678b7225	Restore system prompt on requests and default `num_keep` to `0`	2023-11-03 13:25:25 -07:00
Jeffrey Morgan	06589a3b30	Set `NumKeep` to `4` by default (#982)	2023-11-02 17:26:11 -07:00
Michael Yang	1fd511e661	Merge pull request #975 from jmorganca/mxyng/downloads update downloads to use retry wrapper	2023-11-02 16:12:48 -07:00
Michael Yang	6db3691b8f	update default NumKeep	2023-11-02 15:47:35 -07:00
Michael Yang	60bb3c03a1	use http.Method	2023-11-02 13:12:45 -07:00
Bruce MacDonald	5c3491f425	allow for a configurable ollama model storage directory (#897) * allow for a configurable ollama models directory - set OLLAMA_MODELS in the environment that ollama is running in to change where model files are stored - update docs Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com> Co-Authored-By: Jay Nakrani <dhananjaynakrani@gmail.com> Co-Authored-By: Akhil Acharya <akhilcacharya@gmail.com> Co-Authored-By: Sasha Devol <sasha.devol@protonmail.com>	2023-10-27 10:19:59 -04:00
Michael Yang	28c3f288e2	client: fix trailing slash	2023-10-26 11:09:38 -07:00
Michael Yang	459f4a7889	fix: ollama host for hostname	2023-10-20 11:32:41 -07:00
Bruce MacDonald	fe6f3b48f7	do not reload the running llm when runtime params change (#840) - only reload the running llm if the model has changed, or the options for loading the running model have changed - rename loaded llm to runner to differentiate from loaded model image - remove logic which keeps the first system prompt in the generation context	2023-10-19 10:39:58 -04:00
Michael Yang	92189a5855	fix memory check	2023-10-13 14:47:29 -07:00
Bruce MacDonald	6fe178134d	improve api error handling (#781) - remove new lines from llama.cpp error messages relayed to client - check api option types and return error on wrong type - change num layers from 95% VRAM to 92% VRAM	2023-10-13 16:57:10 -04:00
Bruce MacDonald	7804b8fab9	validate api options fields from map (#711)	2023-10-12 11:18:11 -04:00
Michael Yang	b599946b74	add format bytes	2023-10-11 14:08:23 -07:00
Bruce MacDonald	274d5a5fdf	optional parameter to not stream response (#639) * update streaming request accept header * add optional stream param to request bodies	2023-10-11 12:54:27 -04:00
Michael Yang	2cfffea02e	handle client proxy	2023-10-09 12:33:47 -07:00
Bruce MacDonald	2130c0708b	output type parsed from modelfile (#678)	2023-10-05 14:58:04 -04:00
Bruce MacDonald	9e2de1bd2c	increase streaming buffer size (#692)	2023-10-04 14:09:00 -04:00
Bruce MacDonald	1fbf3585d6	Relay default values to llama runner (#672) * include seed in params for llama.cpp server and remove empty filter for temp * relay default predict options to llama.cpp - reorganize options to match predict request for readability * omit empty stop --------- Co-authored-by: hallh <hallh@users.noreply.github.com>	2023-10-02 14:53:16 -04:00
Bruce MacDonald	a1b2d95f96	remove unused push/pull params (#650)	2023-09-29 17:27:19 -04:00
Michael Yang	f40b3de758	use int64 consistently	2023-09-28 11:07:24 -07:00
Patrick Devine	8efbc5df55	DRAFT: add a simple python client to access ollama (#522)	2023-09-14 16:37:38 -07:00
Bruce MacDonald	f221637053	first pass at linux gpu support (#454) * linux gpu support * handle multiple gpus * add cuda docker image (#488) --------- Co-authored-by: Michael Yang <mxyng@pm.me>	2023-09-12 11:04:35 -04:00
Patrick Devine	790d24eb7b	add show command (#474)	2023-09-06 11:04:17 -07:00
Michael Yang	0f541a0367	s/ListResponseModel/ModelResponse/	2023-08-31 09:47:10 -04:00
Bruce MacDonald	42998d797d	subprocess llama.cpp server (#401) * remove c code * pack llama.cpp * use request context for llama_cpp * let llama_cpp decide the number of threads to use * stop llama runner when app stops * remove sample count and duration metrics * use go generate to get libraries * tmp dir for running llm	2023-08-30 16:35:03 -04:00
Michael Yang	982c535428	Merge pull request #428 from jmorganca/mxyng/upload-chunks update upload chunks	2023-08-30 07:47:17 -07:00
Patrick Devine	8bbff2df98	add model IDs (#439)	2023-08-28 20:50:24 -07:00
Michael Yang	246dc65417	loosen http status code checks	2023-08-28 18:34:53 -04:00
Jeffrey Morgan	22ab7f5f88	default host to `127.0.0.1`, fixes #424	2023-08-26 11:59:28 -07:00
Michael Yang	2c7f956b38	add version	2023-08-22 09:40:58 -07:00
Michael Yang	f723bf0879	ignore nil map values	2023-08-17 15:50:46 -07:00
Jeffrey Morgan	54bb49a502	parse protocol for `OLLAMA_HOST`	2023-08-17 18:20:44 -04:00
Jeffrey Morgan	5ee6116420	set default `OLLAMA_HOST` to `http://localhost:11434`	2023-08-16 12:22:59 -04:00
Blake Mizerany	67e593e355	cmd: support OLLAMA_CLIENT_HOST environment variable (#262) * cmd: support OLLAMA_HOST environment variable This commit adds support for the OLLAMA_HOST environment variable. This variable can be used to specify the host to which the client should connect. This is useful when the client is running somewhere other than the host where the server is running. The new api.FromEnv function is used to read configure clients from the environment. Clients wishing to use the environment variable being consistent with the Ollama CLI can use this new function. * Update api/client.go Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update api/client.go Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2023-08-16 11:03:48 -04:00
Michael Yang	f27bc261cf	s/parmeter/parameter/	2023-08-10 16:26:06 -07:00
Michael Yang	81d8d7b73f	fix could not convert int	2023-08-10 16:24:17 -07:00
Patrick Devine	be989d89d1	Token auth (#314)	2023-08-10 11:34:25 -07:00
Bruce MacDonald	4b3507f036	embeddings endpoint Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>	2023-08-10 11:45:57 -04:00
Bruce MacDonald	7a5f3616fd	embed text document in modelfile	2023-08-09 10:26:19 -04:00
Bruce MacDonald	21ddcaa1f1	pr comments - default to embeddings enabled - move embedding logic for loaded model to request - allow embedding full directory - close llm on reload	2023-08-08 13:49:37 -04:00
Michael Yang	f2074ed4c0	Merge pull request #306 from jmorganca/default-keep-system automatically set num_keep if num_keep < 0	2023-08-08 09:25:34 -07:00
Jeffrey Morgan	8713ac23a8	allow overriding `template` and `system` in `/api/generate` Fixes #297 Fixes #296	2023-08-08 00:55:34 -04:00
Michael Yang	4dc5b117dd	automatically set num_keep if num_keep < 0 num_keep defines how many tokens to keep in the context when truncating inputs. if left to its default value of -1, the server will calculate num_keep to be the left of the system instructions	2023-08-07 16:19:12 -07:00
Michael Yang	b9f4d67554	configurable rope frequency parameters	2023-08-03 22:11:58 -07:00
Bruce MacDonald	8b1e791820	allow specifying zero values in modelfile	2023-08-02 17:07:53 -04:00
Bruce MacDonald	8f8b6288ac	check server is running before running command	2023-08-02 10:51:23 -04:00
Bruce MacDonald	765994362c	use head to check heartbeat	2023-08-01 14:50:38 -04:00
Bruce MacDonald	1c5a8770ee	read runner parameter options from map - read runner options from map to see what was specified explicitly and overwrite zero values	2023-08-01 13:38:19 -04:00
Jeffrey Morgan	528bafa585	cache loaded model	2023-08-01 11:24:18 -04:00
Bruce MacDonald	e72fe7945f	check server is running before running command	2023-07-31 16:25:57 -04:00
Bruce MacDonald	184ad8f057	allow specifying stop conditions in modelfile	2023-07-28 11:02:04 -04:00
Jeffrey Morgan	822a0e36eb	lower batch size to 512	2023-07-28 10:56:21 -04:00
Michael Yang	fadf75f99d	add stop conditions	2023-07-27 17:00:47 -07:00
Michael Yang	ad3a7d0e2c	add NumGQA	2023-07-27 14:05:11 -07:00
Jeffrey Morgan	688661ab9b	increase default batch size to 1024	2023-07-27 16:51:01 -04:00
Michael Yang	cca61181cb	sample metrics	2023-07-27 09:31:44 -07:00
Michael Yang	c490416189	lock on llm.lock(); decrease batch size	2023-07-27 09:31:44 -07:00
Michael Yang	f62a882760	add session expiration	2023-07-27 09:31:44 -07:00
Michael Yang	3003fc03fc	update predict code	2023-07-27 09:31:44 -07:00
Michael Yang	32aec66e6a	add load duration	2023-07-27 09:31:44 -07:00

191 commits