ollama

Author	SHA1	Message	Date
Michael Yang	4dcceeffb7	let the template do the work	2023-10-18 13:12:00 -07:00
Michael Yang	019e4a4558	image: show parameters	2023-10-18 13:12:00 -07:00
Michael Yang	8299bf76ed	model: native gotemplate adapter template	2023-10-17 15:28:38 -07:00
Michael Yang	ee4979e510	show: no template system if empty	2023-10-17 15:25:43 -07:00
Bruce MacDonald	a0c3e989de	deprecate modelfile embed command (#759)	2023-10-16 11:07:37 -04:00
Michael Yang	f6e98334e4	handle upstream proxies	2023-10-09 11:42:36 -07:00
Bruce MacDonald	d6786f2945	add feedback for reading model metadata (#722)	2023-10-06 16:05:32 -04:00
Michael Yang	8544edca21	parallel chunked downloads	2023-10-06 12:56:43 -07:00
Bruce MacDonald	2130c0708b	output type parsed from modelfile (#678)	2023-10-05 14:58:04 -04:00
Michael Yang	9333b0cc82	Merge pull request #612 from jmorganca/mxyng/prune-empty-directories prune empty directories	2023-09-29 11:23:39 -07:00
Michael Yang	f40b3de758	use int64 consistently	2023-09-28 11:07:24 -07:00
Michael Yang	8608eb4760	prune empty directories	2023-09-27 10:58:09 -07:00
Bruce MacDonald	4cba75efc5	remove tmp directories created by previous servers (#559) * remove tmp directories created by previous servers * clean up on server stop * Update routes.go * Update server/routes.go Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * create top-level temp ollama dir * check file exists before creating --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> Co-authored-by: Michael Yang <mxyng@pm.me>	2023-09-21 20:38:49 +01:00
Michael Yang	499e9007a5	pick chunksize based on location	2023-09-20 11:10:24 -07:00
Michael Yang	a5520bfb42	fix build	2023-09-19 10:42:24 -07:00
Michael Yang	b58d5d16b0	fix mkdir on windows	2023-09-19 09:41:13 -07:00
Patrick Devine	24580df958	only add a layer if there is actual data (#535)	2023-09-18 13:47:45 -07:00
Michael Yang	daa4f096f9	set request.ContentLength This informs the HTTP client the content length is known and disables chunked Transfer-Encoding	2023-09-14 13:32:44 -07:00
Michael Yang	e6881cabd0	remove unused	2023-09-13 14:48:33 -07:00
Michael Yang	0c5a454361	fix model type for 70b	2023-09-12 15:12:59 -07:00
Michael Yang	7dee25a07f	fix falcon decode get model and file type from bin file	2023-09-12 12:34:53 -07:00
Patrick Devine	e7e91cd71c	add autoprune to remove unused layers (#491)	2023-09-11 11:46:35 -07:00
Jeffrey Morgan	3920e15386	add model format to config layer (#497)	2023-09-09 17:53:44 -04:00
Michael Yang	de227b620f	fix nil pointer dereference	2023-09-07 17:24:31 -07:00
Michael Yang	738fe9c4aa	Merge pull request #486 from jmorganca/mxyng/fix-push fix: retry push on expired token	2023-09-07 13:58:34 -07:00
Michael Yang	f0f4943577	fix get auth token	2023-09-07 12:01:56 -07:00
Bruce MacDonald	09dd2aeff9	GGUF support (#441)	2023-09-07 13:55:37 -04:00
Patrick Devine	790d24eb7b	add show command (#474)	2023-09-06 11:04:17 -07:00
Michael Yang	06ef90c051	fix parameter inheritance parameters are not inherited because they are processed differently from other layer. fix this by explicitly merging the inherited params into the new params. parameter values defined in the new modelfile will override those defined in the inherited modelfile. array lists are replaced instead of appended	2023-09-05 11:40:20 -07:00
Michael Yang	e9f6df7dca	use slices.DeleteFunc	2023-09-05 09:56:59 -07:00
Quinn Slack	62d29b2157	do not HTML-escape prompt The `html/template` package automatically HTML-escapes interpolated strings in templates. This behavior is undesirable because it causes prompts like `<h1>hello` to be escaped to `&lt;h1&gt;hello` before being passed to the LLM. The included test case passes, but before the code change, it failed: ``` --- FAIL: TestModelPrompt images_test.go:21: got "a&lt;h1&gt;b", want "a<h1>b" ```	2023-09-01 17:16:38 -05:00
Michael Yang	1c8fd627ad	windows: fix create modelfile	2023-08-31 09:47:10 -04:00
Michael Yang	ae950b00f1	windows: fix delete	2023-08-31 09:47:10 -04:00
Bruce MacDonald	42998d797d	subprocess llama.cpp server (#401) * remove c code * pack llama.cpp * use request context for llama_cpp * let llama_cpp decide the number of threads to use * stop llama runner when app stops * remove sample count and duration metrics * use go generate to get libraries * tmp dir for running llm	2023-08-30 16:35:03 -04:00
Quinn Slack	f4432e1dba	treat stop as stop sequences, not exact tokens (#442) The `stop` option to the generate API is a list of sequences that should cause generation to stop. Although these are commonly called "stop tokens", they do not necessarily correspond to LLM tokens (per the LLM's tokenizer). For example, if the caller sends a generate request with `"stop":["\n"]`, then generation should stop on any token containing `\n` (and trim `\n` from the output), not just if the token exactly matches `\n`. If `stop` were interpreted strictly as LLM tokens, then it would require callers of the generate API to know the LLM's tokenizer and enumerate many tokens in the `stop` list. Fixes https://github.com/jmorganca/ollama/issues/295.	2023-08-30 11:53:42 -04:00
Michael Yang	982c535428	Merge pull request #428 from jmorganca/mxyng/upload-chunks update upload chunks	2023-08-30 07:47:17 -07:00
Patrick Devine	8bbff2df98	add model IDs (#439)	2023-08-28 20:50:24 -07:00
Michael Yang	16b06699fd	remove unused parameter	2023-08-28 18:35:18 -04:00
Michael Yang	246dc65417	loosen http status code checks	2023-08-28 18:34:53 -04:00
Michael Yang	59734ca24d	set default template	2023-08-26 12:20:48 -07:00
Michael Yang	32d1a00017	remove unused requestContextKey	2023-08-22 10:49:54 -07:00
Michael Yang	04e2128273	move upload funcs to upload.go	2023-08-22 10:49:53 -07:00
Michael Yang	2cc634689b	use url.URL	2023-08-22 10:49:07 -07:00
Michael Yang	9ec7e37534	Merge pull request #392 from jmorganca/mxyng/version add version	2023-08-22 09:50:25 -07:00
Michael Yang	2c7f956b38	add version	2023-08-22 09:40:58 -07:00
Jeffrey Morgan	a9f6c56652	fix `FROM` instruction erroring when referring to a file	2023-08-22 09:39:42 -07:00
Ryan Baker	0a892419ad	Strip protocol from model path (#377)	2023-08-21 21:56:56 -07:00
Michael Yang	3b49315f97	retry on unauthorized chunk push The token printed for authorized requests has a lifetime of 1h. If an upload exceeds 1h, a chunk push will fail since the token is created on a "start upload" request. This replaces the Pipe with SectionReader which is simpler and implements Seek, a requirement for makeRequestWithRetry. This is slightly worse than using a Pipe since the progress update is directly tied to the chunk size instead of controlled separately.	2023-08-18 11:23:47 -07:00
Michael Yang	7eda70f23b	copy metadata from source	2023-08-17 21:55:25 -07:00
Michael Yang	086449b6c7	fmt	2023-08-17 15:32:31 -07:00
Michael Yang	3cbc6a5c01	fix push manifest	2023-08-17 15:28:12 -07:00
Michael Yang	a894cc792d	model and file type as strings	2023-08-17 12:08:04 -07:00
Michael Yang	b963a83559	Merge pull request #364 from jmorganca/chunked-uploads reimplement chunked uploads	2023-08-17 09:58:51 -07:00
Michael Yang	bf6688abe6	Merge pull request #360 from jmorganca/fix-request-copies Fix request copies	2023-08-17 09:58:42 -07:00
Bruce MacDonald	6005b157c2	retry download on network errors	2023-08-17 10:31:45 -04:00
Michael Yang	5dfe91be8b	reimplement chunked uploads	2023-08-16 14:50:24 -07:00
Michael Yang	9f944c00f1	push: retry on unauthorized	2023-08-16 11:35:33 -07:00
Michael Yang	56e87cecb1	images: remove body copies	2023-08-16 10:30:41 -07:00
Michael Yang	5d9a4cd251	Merge pull request #348 from jmorganca/cross-repo-mount cross repo blob mount	2023-08-16 09:20:36 -07:00
Bruce MacDonald	1deb35ca64	use loaded llm for generating model file embeddings	2023-08-15 16:12:02 -03:00
Bruce MacDonald	e2de886831	do not regenerate embeddings	2023-08-15 16:10:22 -03:00
Bruce MacDonald	f0d7c2f5ea	retry download on network errors	2023-08-15 15:07:19 -03:00
Bruce MacDonald	326de48930	use loaded llm for embeddings	2023-08-15 10:50:54 -03:00
Bruce MacDonald	18f2cb0472	dont log fatal	2023-08-15 10:39:59 -03:00
Michael Yang	e26085b921	close open files	2023-08-14 16:08:06 -07:00
Michael Yang	f594c8eb91	cross repo mount	2023-08-14 15:07:35 -07:00
Bruce MacDonald	2c8b680b03	use file info for embeddings cache	2023-08-14 12:11:04 -03:00
Bruce MacDonald	99b6b60085	use model bin digest for embed digest	2023-08-14 11:57:12 -03:00
Bruce MacDonald	e9a9580bdd	do not regenerate embeddings - re-use previously evaluated embeddings when possible - change embeddings digest identifier to be based on model name and embedded file path	2023-08-14 10:34:17 -03:00
Patrick Devine	d9cf18e28d	add maximum retries when pushing (#334)	2023-08-11 15:41:55 -07:00
Michael Yang	6517bcc53c	Merge pull request #290 from jmorganca/add-adapter-layers implement loading ggml lora adapters through the modelfile	2023-08-10 17:23:01 -07:00
Michael Yang	6a6828bddf	Merge pull request #167 from jmorganca/decode-ggml partial decode ggml bin for more info	2023-08-10 17:22:40 -07:00
Patrick Devine	be989d89d1	Token auth (#314)	2023-08-10 11:34:25 -07:00
Michael Yang	6de5d032e1	implement loading ggml lora adapters through the modelfile	2023-08-10 09:23:39 -07:00
Michael Yang	fccf8d179f	partial decode ggml bin for more info	2023-08-10 09:23:10 -07:00
Bruce MacDonald	984c9c628c	fix embeddings invalid values	2023-08-09 16:50:53 -04:00
Bruce MacDonald	ac971c56d1	Update images.go	2023-08-09 11:31:54 -04:00
Bruce MacDonald	868e3b31c7	allow for concurrent pulls of the same files	2023-08-09 11:31:54 -04:00
Bruce MacDonald	1bee2347be	pr feedback - defer closing llm on embedding - do not override licenses - remove debugging print line - reformat model file docs	2023-08-08 17:01:37 -04:00
Bruce MacDonald	884d78ceb3	allow embedding from model binary	2023-08-08 14:38:57 -04:00
Bruce MacDonald	21ddcaa1f1	pr comments - default to embeddings enabled - move embedding logic for loaded model to request - allow embedding full directory - close llm on reload	2023-08-08 13:49:37 -04:00
Bruce MacDonald	a6f6d18f83	embed text document in modelfile	2023-08-08 11:27:17 -04:00
Jeffrey Morgan	8713ac23a8	allow overriding `template` and `system` in `/api/generate` Fixes #297 Fixes #296	2023-08-08 00:55:34 -04:00
Michael Yang	a71ff3f6a2	use a pipe to push to registry with progress switch to a monolithic upload instead of a chunk upload through a pipe to report progress	2023-08-03 10:37:13 -07:00
Bruce MacDonald	1c5a8770ee	read runner parameter options from map - read runner options from map to see what was specified explicitly and overwrite zero values	2023-08-01 13:38:19 -04:00
Bruce MacDonald	daa0d1de7a	allow specifying zero values in modelfile	2023-08-01 13:37:50 -04:00
Jeffrey Morgan	528bafa585	cache loaded model	2023-08-01 11:24:18 -04:00
Michael Yang	872011630a	fix license	2023-07-31 21:46:48 -07:00
Michael Yang	203fdbc4b8	check err	2023-07-31 21:46:48 -07:00
Michael Yang	70e0ab6b3d	remove unnecessary fmt.Sprintf	2023-07-31 21:46:47 -07:00
Jeffrey Morgan	9968153729	fix Go warnings	2023-07-31 21:37:40 -04:00
Michael Yang	eadee46840	Merge pull request #236 from jmorganca/check-os-walk check os.Walk err	2023-07-28 14:14:21 -07:00
Michael Yang	bd58528fbd	check os.Walk err	2023-07-28 12:15:31 -07:00
Michael Yang	c5e447a359	remove io/ioutil import ioutil is deprecated	2023-07-28 12:06:03 -07:00
Bruce MacDonald	f5cbcb08e6	specify stop params separately	2023-07-28 11:29:00 -04:00
Bruce MacDonald	184ad8f057	allow specifying stop conditions in modelfile	2023-07-28 11:02:04 -04:00
Bruce MacDonald	1ac38ec89c	improve modelfile docs	2023-07-27 15:13:04 -04:00
Bruce MacDonald	4c1caa3733	download models when creating from modelfile	2023-07-25 14:25:13 -04:00
Bruce MacDonald	07ed69bc37	remove redundant err var	2023-07-25 10:30:14 -04:00
Bruce MacDonald	536028c35a	better error message when model not found on pull	2023-07-24 17:48:17 -04:00
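
Note on commit 62d29b2157 (do not HTML-escape prompt): the escaping behavior it describes can be reproduced with the Go standard library alone. The sketch below is only an illustration, not code from the ollama repository. The template string "a{{ .Prompt }}b" and the helper names renderHTML and renderText are hypothetical, and using `text/template` as the non-escaping alternative is an assumption about one possible fix, not a statement of what the commit changed.

```go
package main

import (
	"fmt"
	htmltemplate "html/template"
	"strings"
	texttemplate "text/template"
)

// promptTemplate is a hypothetical template used only for this illustration.
const promptTemplate = "a{{ .Prompt }}b"

// renderHTML renders the prompt with html/template, which HTML-escapes
// interpolated values automatically.
func renderHTML(prompt string) (string, error) {
	t, err := htmltemplate.New("prompt").Parse(promptTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, map[string]string{"Prompt": prompt}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

// renderText renders the prompt with text/template, which inserts
// interpolated values verbatim.
func renderText(prompt string) (string, error) {
	t, err := texttemplate.New("prompt").Parse(promptTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, map[string]string{"Prompt": prompt}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	escaped, err := renderHTML("<h1>")
	if err != nil {
		panic(err)
	}
	raw, err := renderText("<h1>")
	if err != nil {
		panic(err)
	}
	fmt.Println(escaped) // a&lt;h1&gt;b
	fmt.Println(raw)     // a<h1>b
}
```

The two outputs correspond to the "got" and "want" values in the failing test quoted in the commit message.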


187 commits