ollama

Author	SHA1	Message	Date
Bruce MacDonald	7a0899d62d	chat api (#991) - update chat docs - add messages chat endpoint - remove deprecated context and template generate parameters from docs - context and template are still supported for the time being and will continue to work as expected - add partial response to chat history	2023-12-04 18:01:06 -05:00
Joshua Pham	bb80a597db	Fix adapter loading from SHA hash	2023-12-01 13:50:55 -05:00
Patrick Devine	cde31cb220	Allow setting parameters in the REPL (#1294)	2023-11-29 09:56:42 -08:00
Bruce MacDonald	37d95157df	fix relative path on create (#1222)	2023-11-21 15:43:17 -05:00
Jeffrey Morgan	02524a56ff	check retry for authorization error	2023-11-19 00:19:53 -05:00
Jeffrey Morgan	12e046f12a	remove unused function	2023-11-18 22:16:51 -05:00
Bruce MacDonald	0b19e24d81	only retry once on auth failure (#1175)	2023-11-17 14:22:35 -05:00
Bruce MacDonald	4b3f4bc7d9	return failure details when unauthorized to push (#1131) Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2023-11-16 16:44:18 -05:00
Michael Yang	652d90e1c7	Update server/images.go Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2023-11-15 15:16:23 -08:00
Michael Yang	1901044b07	use checksum reference	2023-11-15 15:16:23 -08:00
Michael Yang	a07c935d34	ignore non blobs	2023-11-15 15:16:23 -08:00
Michael Yang	b0d14ed51c	refactor create model	2023-11-15 15:16:23 -08:00
Daniel Reis	7c438f2c53	Replaced method	2023-11-10 20:22:03 +00:00
Daniel Reis	6e46338d44	Reverting previous changes	2023-11-10 20:21:35 +00:00
Daniel Reis	d17730356a	Removed inline parse model path	2023-11-09 22:44:26 +00:00
Daniel Reis	32d79a6eea	Using 'GetShortTagname' method instead	2023-11-09 22:40:37 +00:00
Jeffrey Morgan	e21579a0f1	Restore system prompt on requests	2023-11-03 17:26:45 -07:00
Jeffrey Morgan	c50b01bc21	check `request.Context` for initial system prompt	2023-11-02 18:17:00 -07:00
Bruce MacDonald	b9dc875401	remove modelfile context deprecated in v0.0.7 (#974)	2023-11-02 20:52:56 -04:00
Michael Yang	1fd511e661	Merge pull request #975 from jmorganca/mxyng/downloads update downloads to use retry wrapper	2023-11-02 16:12:48 -07:00
Jeffrey Morgan	1beb5645a9	only use system prompt if context is not provided (#978)	2023-11-02 15:48:02 -07:00
Michael Yang	fe5a872444	fix upload	2023-11-02 13:25:58 -07:00
Michael Yang	d39709260f	download with retry	2023-11-02 13:16:11 -07:00
Michael Yang	60bb3c03a1	use http.Method	2023-11-02 13:12:45 -07:00
Michael Yang	4e09aab8b9	concurrent uploads	2023-10-27 17:07:33 -07:00
Bruce MacDonald	5c3491f425	allow for a configurable ollama model storage directory (#897) * allow for a configurable ollama models directory - set OLLAMA_MODELS in the environment that ollama is running in to change where model files are stored - update docs Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com> Co-Authored-By: Jay Nakrani <dhananjaynakrani@gmail.com> Co-Authored-By: Akhil Acharya <akhilcacharya@gmail.com> Co-Authored-By: Sasha Devol <sasha.devol@protonmail.com>	2023-10-27 10:19:59 -04:00
Michael Yang	846f593dbf	Merge pull request #828 from jmorganca/mxyng/template-parameters image: show parameters	2023-10-19 09:31:31 -07:00
Michael Yang	a19d47642e	models: rm workDir from CreateModel unused after removing EMBED	2023-10-19 09:21:04 -07:00
Bruce MacDonald	fe6f3b48f7	do not reload the running llm when runtime params change (#840) - only reload the running llm if the model has changed, or the options for loading the running model have changed - rename loaded llm to runner to differentiate from loaded model image - remove logic which keeps the first system prompt in the generation context	2023-10-19 10:39:58 -04:00
Michael Yang	4dcceeffb7	let the template do the work	2023-10-18 13:12:00 -07:00
Michael Yang	019e4a4558	image: show parameters	2023-10-18 13:12:00 -07:00
Michael Yang	8299bf76ed	model: native gotemplate adapter template	2023-10-17 15:28:38 -07:00
Michael Yang	ee4979e510	show: no template system if empty	2023-10-17 15:25:43 -07:00
Bruce MacDonald	a0c3e989de	deprecate modelfile embed command (#759)	2023-10-16 11:07:37 -04:00
Michael Yang	f6e98334e4	handle upstream proxies	2023-10-09 11:42:36 -07:00
Bruce MacDonald	d6786f2945	add feedback for reading model metadata (#722)	2023-10-06 16:05:32 -04:00
Michael Yang	8544edca21	parallel chunked downloads	2023-10-06 12:56:43 -07:00
Bruce MacDonald	2130c0708b	output type parsed from modelfile (#678)	2023-10-05 14:58:04 -04:00
Michael Yang	9333b0cc82	Merge pull request #612 from jmorganca/mxyng/prune-empty-directories prune empty directories	2023-09-29 11:23:39 -07:00
Michael Yang	f40b3de758	use int64 consistently	2023-09-28 11:07:24 -07:00
Michael Yang	8608eb4760	prune empty directories	2023-09-27 10:58:09 -07:00
Bruce MacDonald	4cba75efc5	remove tmp directories created by previous servers (#559) * remove tmp directories created by previous servers * clean up on server stop * Update routes.go * Update server/routes.go Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * create top-level temp ollama dir * check file exists before creating --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> Co-authored-by: Michael Yang <mxyng@pm.me>	2023-09-21 20:38:49 +01:00
Michael Yang	499e9007a5	pick chunksize based on location	2023-09-20 11:10:24 -07:00
Michael Yang	a5520bfb42	fix build	2023-09-19 10:42:24 -07:00
Michael Yang	b58d5d16b0	fix mkdir on windows	2023-09-19 09:41:13 -07:00
Patrick Devine	24580df958	only add a layer if there is actual data (#535)	2023-09-18 13:47:45 -07:00
Michael Yang	daa4f096f9	set request.ContentLength This informs the HTTP client the content length is known and disables chunked Transfer-Encoding	2023-09-14 13:32:44 -07:00
Michael Yang	e6881cabd0	remove unused	2023-09-13 14:48:33 -07:00
Michael Yang	0c5a454361	fix model type for 70b	2023-09-12 15:12:59 -07:00
Michael Yang	7dee25a07f	fix falcon decode get model and file type from bin file	2023-09-12 12:34:53 -07:00
Patrick Devine	e7e91cd71c	add autoprune to remove unused layers (#491)	2023-09-11 11:46:35 -07:00
Jeffrey Morgan	3920e15386	add model format to config layer (#497)	2023-09-09 17:53:44 -04:00
Michael Yang	de227b620f	fix nil pointer dereference	2023-09-07 17:24:31 -07:00
Michael Yang	738fe9c4aa	Merge pull request #486 from jmorganca/mxyng/fix-push fix: retry push on expired token	2023-09-07 13:58:34 -07:00
Michael Yang	f0f4943577	fix get auth token	2023-09-07 12:01:56 -07:00
Bruce MacDonald	09dd2aeff9	GGUF support (#441)	2023-09-07 13:55:37 -04:00
Patrick Devine	790d24eb7b	add show command (#474)	2023-09-06 11:04:17 -07:00
Michael Yang	06ef90c051	fix parameter inheritance parameters are not inherited because they are processed differently from other layers. fix this by explicitly merging the inherited params into the new params. parameter values defined in the new modelfile will override those defined in the inherited modelfile. array lists are replaced instead of appended	2023-09-05 11:40:20 -07:00
Michael Yang	e9f6df7dca	use slices.DeleteFunc	2023-09-05 09:56:59 -07:00
Quinn Slack	62d29b2157	do not HTML-escape prompt The `html/template` package automatically HTML-escapes interpolated strings in templates. This behavior is undesirable because it causes prompts like `<h1>hello` to be escaped to `&lt;h1&gt;hello` before being passed to the LLM. The included test case passes, but before the code change, it failed: ``` --- FAIL: TestModelPrompt images_test.go:21: got "a&lt;h1&gt;b", want "a<h1>b" ```	2023-09-01 17:16:38 -05:00
Michael Yang	1c8fd627ad	windows: fix create modelfile	2023-08-31 09:47:10 -04:00
Michael Yang	ae950b00f1	windows: fix delete	2023-08-31 09:47:10 -04:00
Bruce MacDonald	42998d797d	subprocess llama.cpp server (#401) * remove c code * pack llama.cpp * use request context for llama_cpp * let llama_cpp decide the number of threads to use * stop llama runner when app stops * remove sample count and duration metrics * use go generate to get libraries * tmp dir for running llm	2023-08-30 16:35:03 -04:00
Quinn Slack	f4432e1dba	treat stop as stop sequences, not exact tokens (#442) The `stop` option to the generate API is a list of sequences that should cause generation to stop. Although these are commonly called "stop tokens", they do not necessarily correspond to LLM tokens (per the LLM's tokenizer). For example, if the caller sends a generate request with `"stop":["\n"]`, then generation should stop on any token containing `\n` (and trim `\n` from the output), not just if the token exactly matches `\n`. If `stop` were interpreted strictly as LLM tokens, then it would require callers of the generate API to know the LLM's tokenizer and enumerate many tokens in the `stop` list. Fixes https://github.com/jmorganca/ollama/issues/295.	2023-08-30 11:53:42 -04:00
Michael Yang	982c535428	Merge pull request #428 from jmorganca/mxyng/upload-chunks update upload chunks	2023-08-30 07:47:17 -07:00
Patrick Devine	8bbff2df98	add model IDs (#439)	2023-08-28 20:50:24 -07:00
Michael Yang	16b06699fd	remove unused parameter	2023-08-28 18:35:18 -04:00
Michael Yang	246dc65417	loosen http status code checks	2023-08-28 18:34:53 -04:00
Michael Yang	59734ca24d	set default template	2023-08-26 12:20:48 -07:00
Michael Yang	32d1a00017	remove unused requestContextKey	2023-08-22 10:49:54 -07:00
Michael Yang	04e2128273	move upload funcs to upload.go	2023-08-22 10:49:53 -07:00
Michael Yang	2cc634689b	use url.URL	2023-08-22 10:49:07 -07:00
Michael Yang	9ec7e37534	Merge pull request #392 from jmorganca/mxyng/version add version	2023-08-22 09:50:25 -07:00
Michael Yang	2c7f956b38	add version	2023-08-22 09:40:58 -07:00
Jeffrey Morgan	a9f6c56652	fix `FROM` instruction erroring when referring to a file	2023-08-22 09:39:42 -07:00
Ryan Baker	0a892419ad	Strip protocol from model path (#377)	2023-08-21 21:56:56 -07:00
Michael Yang	3b49315f97	retry on unauthorized chunk push The token printed for authorized requests has a lifetime of 1h. If an upload exceeds 1h, a chunk push will fail since the token is created on a "start upload" request. This replaces the Pipe with SectionReader which is simpler and implements Seek, a requirement for makeRequestWithRetry. This is slightly worse than using a Pipe since the progress update is directly tied to the chunk size instead of controlled separately.	2023-08-18 11:23:47 -07:00
Michael Yang	7eda70f23b	copy metadata from source	2023-08-17 21:55:25 -07:00
Michael Yang	086449b6c7	fmt	2023-08-17 15:32:31 -07:00
Michael Yang	3cbc6a5c01	fix push manifest	2023-08-17 15:28:12 -07:00
Michael Yang	a894cc792d	model and file type as strings	2023-08-17 12:08:04 -07:00
Michael Yang	b963a83559	Merge pull request #364 from jmorganca/chunked-uploads reimplement chunked uploads	2023-08-17 09:58:51 -07:00
Michael Yang	bf6688abe6	Merge pull request #360 from jmorganca/fix-request-copies Fix request copies	2023-08-17 09:58:42 -07:00
Bruce MacDonald	6005b157c2	retry download on network errors	2023-08-17 10:31:45 -04:00
Michael Yang	5dfe91be8b	reimplement chunked uploads	2023-08-16 14:50:24 -07:00
Michael Yang	9f944c00f1	push: retry on unauthorized	2023-08-16 11:35:33 -07:00
Michael Yang	56e87cecb1	images: remove body copies	2023-08-16 10:30:41 -07:00
Michael Yang	5d9a4cd251	Merge pull request #348 from jmorganca/cross-repo-mount cross repo blob mount	2023-08-16 09:20:36 -07:00
Bruce MacDonald	1deb35ca64	use loaded llm for generating model file embeddings	2023-08-15 16:12:02 -03:00
Bruce MacDonald	e2de886831	do not regenerate embeddings	2023-08-15 16:10:22 -03:00
Bruce MacDonald	f0d7c2f5ea	retry download on network errors	2023-08-15 15:07:19 -03:00
Bruce MacDonald	326de48930	use loaded llm for embeddings	2023-08-15 10:50:54 -03:00
Bruce MacDonald	18f2cb0472	dont log fatal	2023-08-15 10:39:59 -03:00
Michael Yang	e26085b921	close open files	2023-08-14 16:08:06 -07:00
Michael Yang	f594c8eb91	cross repo mount	2023-08-14 15:07:35 -07:00
Bruce MacDonald	2c8b680b03	use file info for embeddings cache	2023-08-14 12:11:04 -03:00
Bruce MacDonald	99b6b60085	use model bin digest for embed digest	2023-08-14 11:57:12 -03:00
Bruce MacDonald	e9a9580bdd	do not regenerate embeddings - re-use previously evaluated embeddings when possible - change embeddings digest identifier to be based on model name and embedded file path	2023-08-14 10:34:17 -03:00
Patrick Devine	d9cf18e28d	add maximum retries when pushing (#334)	2023-08-11 15:41:55 -07:00
Michael Yang	6517bcc53c	Merge pull request #290 from jmorganca/add-adapter-layers implement loading ggml lora adapters through the modelfile	2023-08-10 17:23:01 -07:00

216 commits