ollama

Author	SHA1	Message	Date
Jeffrey Morgan	17678b7225	Restore system prompt on requests and default `num_keep` to `0`	2023-11-03 13:25:25 -07:00
Jeffrey Morgan	06589a3b30	Set `NumKeep` to `4` by default (#982)	2023-11-02 17:26:11 -07:00
Michael Yang	1fd511e661	Merge pull request #975 from jmorganca/mxyng/downloads update downloads to use retry wrapper	2023-11-02 16:12:48 -07:00
Michael Yang	6db3691b8f	update default NumKeep	2023-11-02 15:47:35 -07:00
Michael Yang	60bb3c03a1	use http.Method	2023-11-02 13:12:45 -07:00
Bruce MacDonald	5c3491f425	allow for a configurable ollama model storage directory (#897) * allow for a configurable ollama models directory - set OLLAMA_MODELS in the environment that ollama is running in to change where model files are stored - update docs Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com> Co-Authored-By: Jay Nakrani <dhananjaynakrani@gmail.com> Co-Authored-By: Akhil Acharya <akhilcacharya@gmail.com> Co-Authored-By: Sasha Devol <sasha.devol@protonmail.com>	2023-10-27 10:19:59 -04:00
Michael Yang	28c3f288e2	client: fix trailing slash	2023-10-26 11:09:38 -07:00
Michael Yang	459f4a7889	fix: ollama host for hostname	2023-10-20 11:32:41 -07:00
Bruce MacDonald	fe6f3b48f7	do not reload the running llm when runtime params change (#840) - only reload the running llm if the model has changed, or the options for loading the running model have changed - rename loaded llm to runner to differentiate from loaded model image - remove logic which keeps the first system prompt in the generation context	2023-10-19 10:39:58 -04:00
Michael Yang	92189a5855	fix memory check	2023-10-13 14:47:29 -07:00
Bruce MacDonald	6fe178134d	improve api error handling (#781) - remove new lines from llama.cpp error messages relayed to client - check api option types and return error on wrong type - change num layers from 95% VRAM to 92% VRAM	2023-10-13 16:57:10 -04:00
Bruce MacDonald	7804b8fab9	validate api options fields from map (#711)	2023-10-12 11:18:11 -04:00
Michael Yang	b599946b74	add format bytes	2023-10-11 14:08:23 -07:00
Bruce MacDonald	274d5a5fdf	optional parameter to not stream response (#639) * update streaming request accept header * add optional stream param to request bodies	2023-10-11 12:54:27 -04:00
Michael Yang	2cfffea02e	handle client proxy	2023-10-09 12:33:47 -07:00
Bruce MacDonald	2130c0708b	output type parsed from modelfile (#678)	2023-10-05 14:58:04 -04:00
Bruce MacDonald	9e2de1bd2c	increase streaming buffer size (#692)	2023-10-04 14:09:00 -04:00
Bruce MacDonald	1fbf3585d6	Relay default values to llama runner (#672) * include seed in params for llama.cpp server and remove empty filter for temp * relay default predict options to llama.cpp - reorganize options to match predict request for readability * omit empty stop --------- Co-authored-by: hallh <hallh@users.noreply.github.com>	2023-10-02 14:53:16 -04:00
Bruce MacDonald	a1b2d95f96	remove unused push/pull params (#650)	2023-09-29 17:27:19 -04:00
Michael Yang	f40b3de758	use int64 consistently	2023-09-28 11:07:24 -07:00
Patrick Devine	8efbc5df55	DRAFT: add a simple python client to access ollama (#522)	2023-09-14 16:37:38 -07:00
Bruce MacDonald	f221637053	first pass at linux gpu support (#454) * linux gpu support * handle multiple gpus * add cuda docker image (#488) --------- Co-authored-by: Michael Yang <mxyng@pm.me>	2023-09-12 11:04:35 -04:00
Patrick Devine	790d24eb7b	add show command (#474)	2023-09-06 11:04:17 -07:00
Michael Yang	0f541a0367	s/ListResponseModel/ModelResponse/	2023-08-31 09:47:10 -04:00
Bruce MacDonald	42998d797d	subprocess llama.cpp server (#401) * remove c code * pack llama.cpp * use request context for llama_cpp * let llama_cpp decide the number of threads to use * stop llama runner when app stops * remove sample count and duration metrics * use go generate to get libraries * tmp dir for running llm	2023-08-30 16:35:03 -04:00
Michael Yang	982c535428	Merge pull request #428 from jmorganca/mxyng/upload-chunks update upload chunks	2023-08-30 07:47:17 -07:00
Patrick Devine	8bbff2df98	add model IDs (#439)	2023-08-28 20:50:24 -07:00
Michael Yang	246dc65417	loosen http status code checks	2023-08-28 18:34:53 -04:00
Jeffrey Morgan	22ab7f5f88	default host to `127.0.0.1`, fixes #424	2023-08-26 11:59:28 -07:00
Michael Yang	2c7f956b38	add version	2023-08-22 09:40:58 -07:00
Michael Yang	f723bf0879	ignore nil map values	2023-08-17 15:50:46 -07:00
Jeffrey Morgan	54bb49a502	parse protocol for `OLLAMA_HOST`	2023-08-17 18:20:44 -04:00
Jeffrey Morgan	5ee6116420	set default `OLLAMA_HOST` to `http://localhost:11434`	2023-08-16 12:22:59 -04:00
Blake Mizerany	67e593e355	cmd: support OLLAMA_CLIENT_HOST environment variable (#262) * cmd: support OLLAMA_HOST environment variable This commit adds support for the OLLAMA_HOST environment variable. This variable can be used to specify the host to which the client should connect. This is useful when the client is running somewhere other than the host where the server is running. The new api.FromEnv function is used to read configure clients from the environment. Clients wishing to use the environment variable being consistent with the Ollama CLI can use this new function. * Update api/client.go Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update api/client.go Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2023-08-16 11:03:48 -04:00
Michael Yang	f27bc261cf	s/parmeter/parameter/	2023-08-10 16:26:06 -07:00
Michael Yang	81d8d7b73f	fix could not convert int	2023-08-10 16:24:17 -07:00
Patrick Devine	be989d89d1	Token auth (#314)	2023-08-10 11:34:25 -07:00
Bruce MacDonald	4b3507f036	embeddings endpoint Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>	2023-08-10 11:45:57 -04:00
Bruce MacDonald	7a5f3616fd	embed text document in modelfile	2023-08-09 10:26:19 -04:00
Bruce MacDonald	21ddcaa1f1	pr comments - default to embeddings enabled - move embedding logic for loaded model to request - allow embedding full directory - close llm on reload	2023-08-08 13:49:37 -04:00
Michael Yang	f2074ed4c0	Merge pull request #306 from jmorganca/default-keep-system automatically set num_keep if num_keep < 0	2023-08-08 09:25:34 -07:00
Jeffrey Morgan	8713ac23a8	allow overriding `template` and `system` in `/api/generate` Fixes #297 Fixes #296	2023-08-08 00:55:34 -04:00
Michael Yang	4dc5b117dd	automatically set num_keep if num_keep < 0 num_keep defines how many tokens to keep in the context when truncating inputs. if left to its default value of -1, the server will calculate num_keep to be the left of the system instructions	2023-08-07 16:19:12 -07:00
Michael Yang	b9f4d67554	configurable rope frequency parameters	2023-08-03 22:11:58 -07:00
Bruce MacDonald	8b1e791820	allow specifying zero values in modelfile	2023-08-02 17:07:53 -04:00
Bruce MacDonald	8f8b6288ac	check server is running before running command	2023-08-02 10:51:23 -04:00
Bruce MacDonald	765994362c	use head to check heartbeat	2023-08-01 14:50:38 -04:00
Bruce MacDonald	1c5a8770ee	read runner parameter options from map - read runner options from map to see what was specified explicitly and overwrite zero values	2023-08-01 13:38:19 -04:00
Jeffrey Morgan	528bafa585	cache loaded model	2023-08-01 11:24:18 -04:00
Bruce MacDonald	e72fe7945f	check server is running before running command	2023-07-31 16:25:57 -04:00
Bruce MacDonald	184ad8f057	allow specifying stop conditions in modelfile	2023-07-28 11:02:04 -04:00
Jeffrey Morgan	822a0e36eb	lower batch size to 512	2023-07-28 10:56:21 -04:00
Michael Yang	fadf75f99d	add stop conditions	2023-07-27 17:00:47 -07:00
Michael Yang	ad3a7d0e2c	add NumGQA	2023-07-27 14:05:11 -07:00
Jeffrey Morgan	688661ab9b	increase default batch size to 1024	2023-07-27 16:51:01 -04:00
Michael Yang	cca61181cb	sample metrics	2023-07-27 09:31:44 -07:00
Michael Yang	c490416189	lock on llm.lock(); decrease batch size	2023-07-27 09:31:44 -07:00
Michael Yang	f62a882760	add session expiration	2023-07-27 09:31:44 -07:00
Michael Yang	3003fc03fc	update predict code	2023-07-27 09:31:44 -07:00
Michael Yang	32aec66e6a	add load duration	2023-07-27 09:31:44 -07:00
Michael Yang	35af37a2cb	session id	2023-07-27 09:31:44 -07:00
Bruce MacDonald	4c1caa3733	download models when creating from modelfile	2023-07-25 14:25:13 -04:00
Bruce MacDonald	536028c35a	better error message when model not found on pull	2023-07-24 17:48:17 -04:00
Patrick Devine	4cb42ca55e	add copy command (#191)	2023-07-24 11:27:28 -04:00
Patrick Devine	6d6b0d3321	change error handler behavior and fix error when a model isn't found (#173)	2023-07-21 23:02:12 -07:00
Patrick Devine	9f6e97865c	allow pushing/pulling to insecure registries (#157)	2023-07-21 15:42:19 -07:00
Bruce MacDonald	7ba1308595	Merge pull request #147 from jmorganca/brucemacd/cli-err-display Improve CLI error display	2023-07-21 16:10:19 +02:00
Patrick Devine	e7a393de54	add rm command for models (#151)	2023-07-20 16:09:23 -07:00
Michael Yang	1f27d7f1b8	fix stream errors	2023-07-20 12:12:08 -07:00
Bruce MacDonald	ebaa33ac28	display gin api errors in cli	2023-07-20 20:45:12 +02:00
Michael Yang	68df36ae50	fix pull 0 bytes on completed layer	2023-07-18 19:38:11 -07:00
Patrick Devine	5bea29f610	add new list command (#97)	2023-07-18 09:09:45 -07:00
Patrick Devine	2fb52261ad	basic distribution w/ push/pull (#78) * basic distribution w/ push/pull * add the parser * add create, pull, and push * changes to the parser, FROM line, and fix commands * mkdirp new manifest directories * make `blobs` directory if it does not exist * fix go warnings * add progressbar for model pulls * move model struct --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2023-07-16 17:02:22 -07:00
Michael Yang	965f9ad033	Merge pull request #77 from jmorganca/mem continue conversation	2023-07-14 14:57:42 -07:00
Michael Yang	5fefaa5d4d	fix typo	2023-07-14 10:47:18 -07:00
Michael Yang	1775647f76	continue conversation feed responses back into the llm	2023-07-13 17:13:00 -07:00
Michael Yang	05e08d2310	return more info in generate response	2023-07-13 09:37:32 -07:00
Michael Yang	e243329e2e	check api status	2023-07-11 13:42:05 -07:00
Michael Yang	fd4792ec56	call llama.cpp directly from go	2023-07-11 11:59:18 -07:00
Jeffrey Morgan	a3ec1ec2a0	consistent error handling for pull and generate	2023-07-10 21:34:15 -07:00
Michael Yang	edba935d67	return error in generate response	2023-07-10 13:30:10 -07:00
Bruce MacDonald	2d49197b3b	increase default model size to 512	2023-07-10 21:24:41 +02:00
Bruce MacDonald	f5e2e150b8	allow overriding default generate options	2023-07-10 20:58:02 +02:00
Bruce MacDonald	f533f85d44	pr feedback - move error check to api client pull - simplify error check in generate - return nil on any pull error	2023-07-07 17:12:02 -04:00
Bruce MacDonald	61dd87bd90	if directory cannot be resolved, do not fail	2023-07-07 15:27:43 -04:00
Michael Yang	303982b56e	fix run generate	2023-07-07 11:36:29 -07:00
Patrick Devine	3f1b7177f2	pass model and predict options	2023-07-07 09:34:05 -07:00
Michael Yang	291bb97e3d	client request options	2023-07-06 17:08:28 -07:00
Michael Yang	b0e63bfb4c	simplify api client	2023-07-06 17:07:40 -07:00
Michael Yang	c4b9e84945	progress	2023-07-06 17:07:40 -07:00
Michael Yang	3d6009aae3	run prompts	2023-07-06 17:07:40 -07:00
Michael Yang	0637632258	simple pull response	2023-07-06 16:34:44 -04:00
Michael Yang	dd960d1d5e	update generate response	2023-07-06 16:34:44 -04:00
Bruce MacDonald	c9f45abef3	resumable downloads	2023-07-06 16:34:44 -04:00
Bruce MacDonald	7cf5905063	display pull progress	2023-07-06 16:34:44 -04:00
Michael Yang	5079282120	tcp socket	2023-07-06 16:34:44 -04:00
Michael Yang	68e6b4550c	use prompt templates	2023-07-06 16:34:44 -04:00
Bruce MacDonald	a6494f8211	pull models	2023-07-06 16:34:44 -04:00
Jeffrey Morgan	fd962a36e5	client updates	2023-07-06 16:34:44 -04:00
Jeffrey Morgan	6093a88c1a	add llama.cpp go bindings	2023-07-06 16:34:44 -04:00

151 commits