ollama

Author	SHA1	Message	Date
Jeffrey Morgan	cdddd3df65	add `format` to example python client	2023-11-10 10:22:21 -08:00
Jeffrey Morgan	5cba29b9d6	JSON mode: add `"format" as an api parameter (#1051) * add `"format": "json"` as an API parameter --------- Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2023-11-09 16:44:02 -08:00
Bruce MacDonald	a49d6acc1e	add a complete /generate options example (#1035)	2023-11-08 16:44:36 -08:00
Bruce MacDonald	ec2a31e9b3	support raw generation requests (#952) - add the optional `raw` generate request parameter to bypass prompt formatting and response context - add raw request to docs	2023-11-08 14:05:02 -08:00
Jeffrey Morgan	17678b7225	Restore system prompt on requests and default `num_keep` to `0`	2023-11-03 13:25:25 -07:00
Jeffrey Morgan	06589a3b30	Set `NumKeep` to `4` by default (#982)	2023-11-02 17:26:11 -07:00
Michael Yang	1fd511e661	Merge pull request #975 from jmorganca/mxyng/downloads update downloads to use retry wrapper	2023-11-02 16:12:48 -07:00
Michael Yang	6db3691b8f	update default NumKeep	2023-11-02 15:47:35 -07:00
Michael Yang	60bb3c03a1	use http.Method	2023-11-02 13:12:45 -07:00
Bruce MacDonald	5c3491f425	allow for a configurable ollama model storage directory (#897) * allow for a configurable ollama models directory - set OLLAMA_MODELS in the environment that ollama is running in to change where model files are stored - update docs Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com> Co-Authored-By: Jay Nakrani <dhananjaynakrani@gmail.com> Co-Authored-By: Akhil Acharya <akhilcacharya@gmail.com> Co-Authored-By: Sasha Devol <sasha.devol@protonmail.com>	2023-10-27 10:19:59 -04:00
Michael Yang	28c3f288e2	client: fix trailing slash	2023-10-26 11:09:38 -07:00
Michael Yang	459f4a7889	fix: ollama host for hostname	2023-10-20 11:32:41 -07:00
Bruce MacDonald	fe6f3b48f7	do not reload the running llm when runtime params change (#840) - only reload the running llm if the model has changed, or the options for loading the running model have changed - rename loaded llm to runner to differentiate from loaded model image - remove logic which keeps the first system prompt in the generation context	2023-10-19 10:39:58 -04:00
Michael Yang	92189a5855	fix memory check	2023-10-13 14:47:29 -07:00
Bruce MacDonald	6fe178134d	improve api error handling (#781) - remove new lines from llama.cpp error messages relayed to client - check api option types and return error on wrong type - change num layers from 95% VRAM to 92% VRAM	2023-10-13 16:57:10 -04:00
Bruce MacDonald	7804b8fab9	validate api options fields from map (#711)	2023-10-12 11:18:11 -04:00
Michael Yang	b599946b74	add format bytes	2023-10-11 14:08:23 -07:00
Bruce MacDonald	274d5a5fdf	optional parameter to not stream response (#639) * update streaming request accept header * add optional stream param to request bodies	2023-10-11 12:54:27 -04:00
Michael Yang	2cfffea02e	handle client proxy	2023-10-09 12:33:47 -07:00
Bruce MacDonald	2130c0708b	output type parsed from modelfile (#678)	2023-10-05 14:58:04 -04:00
Bruce MacDonald	9e2de1bd2c	increase streaming buffer size (#692)	2023-10-04 14:09:00 -04:00
Bruce MacDonald	1fbf3585d6	Relay default values to llama runner (#672) * include seed in params for llama.cpp server and remove empty filter for temp * relay default predict options to llama.cpp - reorganize options to match predict request for readability * omit empty stop --------- Co-authored-by: hallh <hallh@users.noreply.github.com>	2023-10-02 14:53:16 -04:00
Bruce MacDonald	a1b2d95f96	remove unused push/pull params (#650)	2023-09-29 17:27:19 -04:00
Michael Yang	f40b3de758	use int64 consistently	2023-09-28 11:07:24 -07:00
Patrick Devine	8efbc5df55	DRAFT: add a simple python client to access ollama (#522)	2023-09-14 16:37:38 -07:00
Bruce MacDonald	f221637053	first pass at linux gpu support (#454) * linux gpu support * handle multiple gpus * add cuda docker image (#488) --------- Co-authored-by: Michael Yang <mxyng@pm.me>	2023-09-12 11:04:35 -04:00
Patrick Devine	790d24eb7b	add show command (#474)	2023-09-06 11:04:17 -07:00
Michael Yang	0f541a0367	s/ListResponseModel/ModelResponse/	2023-08-31 09:47:10 -04:00
Bruce MacDonald	42998d797d	subprocess llama.cpp server (#401) * remove c code * pack llama.cpp * use request context for llama_cpp * let llama_cpp decide the number of threads to use * stop llama runner when app stops * remove sample count and duration metrics * use go generate to get libraries * tmp dir for running llm	2023-08-30 16:35:03 -04:00
Michael Yang	982c535428	Merge pull request #428 from jmorganca/mxyng/upload-chunks update upload chunks	2023-08-30 07:47:17 -07:00
Patrick Devine	8bbff2df98	add model IDs (#439)	2023-08-28 20:50:24 -07:00
Michael Yang	246dc65417	loosen http status code checks	2023-08-28 18:34:53 -04:00
Jeffrey Morgan	22ab7f5f88	default host to `127.0.0.1`, fixes #424	2023-08-26 11:59:28 -07:00
Michael Yang	2c7f956b38	add version	2023-08-22 09:40:58 -07:00
Michael Yang	f723bf0879	ignore nil map values	2023-08-17 15:50:46 -07:00
Jeffrey Morgan	54bb49a502	parse protocol for `OLLAMA_HOST`	2023-08-17 18:20:44 -04:00
Jeffrey Morgan	5ee6116420	set default `OLLAMA_HOST` to `http://localhost:11434`	2023-08-16 12:22:59 -04:00
Blake Mizerany	67e593e355	cmd: support OLLAMA_CLIENT_HOST environment variable (#262) * cmd: support OLLAMA_HOST environment variable This commit adds support for the OLLAMA_HOST environment variable. This variable can be used to specify the host to which the client should connect. This is useful when the client is running somewhere other than the host where the server is running. The new api.FromEnv function is used to read configure clients from the environment. Clients wishing to use the environment variable being consistent with the Ollama CLI can use this new function. * Update api/client.go Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update api/client.go Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2023-08-16 11:03:48 -04:00
Michael Yang	f27bc261cf	s/parmeter/parameter/	2023-08-10 16:26:06 -07:00
Michael Yang	81d8d7b73f	fix could not convert int	2023-08-10 16:24:17 -07:00
Patrick Devine	be989d89d1	Token auth (#314)	2023-08-10 11:34:25 -07:00
Bruce MacDonald	4b3507f036	embeddings endpoint Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>	2023-08-10 11:45:57 -04:00
Bruce MacDonald	7a5f3616fd	embed text document in modelfile	2023-08-09 10:26:19 -04:00
Bruce MacDonald	21ddcaa1f1	pr comments - default to embeddings enabled - move embedding logic for loaded model to request - allow embedding full directory - close llm on reload	2023-08-08 13:49:37 -04:00
Michael Yang	f2074ed4c0	Merge pull request #306 from jmorganca/default-keep-system automatically set num_keep if num_keep < 0	2023-08-08 09:25:34 -07:00
Jeffrey Morgan	8713ac23a8	allow overriding `template` and `system` in `/api/generate` Fixes #297 Fixes #296	2023-08-08 00:55:34 -04:00
Michael Yang	4dc5b117dd	automatically set num_keep if num_keep < 0 num_keep defines how many tokens to keep in the context when truncating inputs. if left to its default value of -1, the server will calculate num_keep to be the left of the system instructions	2023-08-07 16:19:12 -07:00
Michael Yang	b9f4d67554	configurable rope frequency parameters	2023-08-03 22:11:58 -07:00
Bruce MacDonald	8b1e791820	allow specifying zero values in modelfile	2023-08-02 17:07:53 -04:00
Bruce MacDonald	8f8b6288ac	check server is running before running command	2023-08-02 10:51:23 -04:00

105 commits