ollama

Author	SHA1	Message	Date
Bruce MacDonald	0cebc79cba	fix: allow importing a model from name reference (#3005 )	2024-03-08 12:27:47 -05:00
Jeffrey Morgan	fc06205971	Revert "adjust download and upload concurrency based on available bandwidth" (#2995 )	2024-03-07 18:10:16 -08:00
Daniel Hiltgen	6c5ccb11f9	Revamp ROCm support This refines where we extract the LLM libraries to by adding a new OLLAMA_HOME env var, that defaults to `~/.ollama`. The logic was already idempotent, so this should speed up startups after the first time a new release is deployed. It also cleans up after itself. We now build only a single ROCm version (latest major) on both windows and linux. Given the large size of ROCm's tensor files, we split the dependency out. It's bundled into the installer on windows, and a separate download on linux. The linux install script is now smart and detects the presence of AMD GPUs and looks to see if rocm v6 is already present, and if not, then downloads our dependency tar file. For Linux discovery, we now use sysfs and check each GPU against what ROCm supports so we can degrade to CPU gracefully instead of having llama.cpp+rocm assert/crash on us. For Windows, we now use go's windows dynamic library loading logic to access the amdhip64.dll APIs to query the GPU information.	2024-03-07 10:36:50 -08:00
Michael Yang	2e20110e50	Merge pull request #2221 from ollama/mxyng/up-down-ccy adjust download and upload concurrency based on available bandwidth	2024-03-07 09:27:33 -08:00
Patrick Devine	2c017ca441	Convert Safetensors to an Ollama model (#2824 )	2024-03-06 21:01:51 -08:00
Jeffrey Morgan	3b4bab3dc5	Fix embeddings load model behavior (#2848 )	2024-02-29 17:40:56 -08:00
Michael Yang	0e19476b56	prepend image tags (#2789 ) instead of appending image tags, prepend them - this generally produces better results	2024-02-29 11:30:14 -08:00
Michael Yang	084d846621	refactor	2024-02-21 13:42:48 -08:00
Michael Yang	6a4b994433	lint	2024-02-21 13:42:48 -08:00
Michael Yang	bea007deb7	use LimitGroup for uploads	2024-02-21 13:42:48 -08:00
Michael Yang	074934be03	adjust group limit based on download speed	2024-02-21 13:42:48 -08:00
Michael Yang	0de12368a0	add new LimitGroup for dynamic concurrency	2024-02-21 13:42:48 -08:00
Michael Yang	917bd61084	refactor download run	2024-02-21 13:42:46 -08:00
Jeffrey Morgan	287ba11500	better error message when calling `/api/generate` or `/api/chat` with embedding models	2024-02-20 21:53:45 -05:00
Jeffrey Morgan	63861f58cc	Support for `bert` and `nomic-bert` embedding models	2024-02-20 21:37:29 -05:00
Michael Yang	210b65268e	replace strings buffer with hasher (#2437 ) the buffered value is going into the hasher eventually so write directly to the hasher instead	2024-02-20 19:07:50 -05:00
Michael Yang	897b213468	use http.DefaultClient (#2530 ) default client already handles proxy	2024-02-20 18:34:47 -05:00
Bruce MacDonald	88622847c6	fix: chat system prompting overrides (#2542 )	2024-02-16 14:42:43 -05:00
Michael Yang	e43648afe5	rerefactor	2024-02-15 05:56:45 +00:00
Daniel Hiltgen	f397e0e988	Move hub auth out to new package	2024-02-15 05:56:45 +00:00
Jeffrey Morgan	48a273f80b	Fix issues with templating prompt in chat mode (#2460 )	2024-02-12 15:06:57 -08:00
Jeffrey Morgan	1f9078d6ae	Check image filetype in api handlers (#2467 )	2024-02-12 11:16:20 -08:00
Jeffrey Morgan	a0a199b108	Fix hanging issue when sending empty content (#2399 )	2024-02-07 19:30:33 -05:00
Jeffrey Morgan	453f572f83	Initial OpenAI `/v1/chat/completions` API compatibility (#2376 )	2024-02-07 17:24:29 -05:00
Michael Yang	e805ac1d59	fix response on token error	2024-02-07 11:05:49 -08:00
Michael Yang	bfbf2f7cf7	Merge pull request #2296 from ollama/mxyng/img-tags append image tags to user content	2024-02-01 13:16:59 -08:00
Michael Yang	3d6f48507a	structured debug prompt	2024-02-01 11:56:28 -08:00
Michael Yang	f3761405c8	use image id	2024-02-01 11:52:42 -08:00
Michael Yang	e49dc9f3d8	fix tests	2024-02-01 11:48:11 -08:00
Michael Yang	d125510b4b	remove image tags	2024-02-01 11:32:51 -08:00
Michael Yang	fb56988014	account for image projection in token count	2024-02-01 09:50:48 -08:00
Michael Yang	d046bee790	use llm.ImageData for chat	2024-01-31 19:18:25 -08:00
Jeffrey Morgan	f11bf0740b	use `llm.ImageData`	2024-01-31 19:13:48 -08:00
Michael Yang	8450bf66e6	trim images	2024-01-31 19:13:47 -08:00
Michael Yang	b4e11be8ef	append image tags to user content	2024-01-31 19:13:10 -08:00
Bruce MacDonald	a896079705	preserve last system message from modelfile (#2289 )	2024-01-31 21:45:01 -05:00
Michael Yang	8ac08a0eec	update slog handler options - consistent format by using text handler for debug and non-debug - truncate source file to just the file name	2024-01-31 15:15:00 -08:00
Michael Yang	c8b1f2369e	remove unnecessary parse raw	2024-01-30 17:00:53 -08:00
Bruce MacDonald	0632dff3f8	trim chat prompt based on llm context size (#1963 )	2024-01-30 15:59:29 -05:00
Jeffrey Morgan	f2245c7c77	print prompt with `OLLAMA_DEBUG=1` (#2245 )	2024-01-28 15:22:35 -08:00
Jeffrey Morgan	e4b9b72f2a	Do not repeat system prompt for chat templating (#2241 )	2024-01-28 14:15:56 -08:00
Patrick Devine	b5cf31b460	add keep_alive to generate/chat/embedding api endpoints (#2146 )	2024-01-26 14:28:02 -08:00
Michael Yang	9d3dcfd0ec	fix logging	2024-01-26 11:04:27 -08:00
Michael Yang	6e0ea5ecc8	Merge pull request #1916 from ollama/mxyng/inactivity-monitor download: add inactivity monitor	2024-01-26 10:56:00 -08:00
Patrick Devine	7c40a67841	Save and load sessions (#2063 )	2024-01-25 12:12:36 -08:00
Michael Yang	c08dfaa23d	fix: remove overwritten model layers if create overrides a manifest, first add the older manifest's layers to the delete map so they can be cleaned up	2024-01-19 14:58:37 -08:00
Michael Yang	aac9ab4db7	fix show handler	2024-01-18 15:36:50 -08:00
Michael Yang	745b5934fa	add model to ModelResponse	2024-01-18 14:32:55 -08:00
Michael Yang	a38d88d828	api: add model for all requests prefer using req.Model and fallback to req.Name	2024-01-18 14:31:37 -08:00
Daniel Hiltgen	fedd705aea	Mechanical switch from log to slog A few obvious levels were adjusted, but generally everything mapped to "info" level.	2024-01-18 14:12:57 -08:00
Michael Yang	96cfb62641	fix: normalize name path before splitting	2024-01-16 16:48:29 -08:00
Patrick Devine	eef50accb4	Fix show parameters (#2017 )	2024-01-16 10:34:44 -08:00
Michael Yang	27331ae3a8	download: add inactivity monitor if a download part is inactive for some time, restart it	2024-01-12 15:23:15 -08:00
Michael Yang	cf29bd2d72	fix: request retry with error this fixes a subtle bug with makeRequestWithRetry where an HTTP status error on a retried request will potentially not return the right err	2024-01-12 13:32:27 -08:00
Michael Yang	2b9892a808	fix(windows): modelpath and list	2024-01-09 09:36:58 -08:00
Michael Yang	2bb2bdd5d4	fix lint	2024-01-09 09:36:58 -08:00
Michael Yang	acfc376efd	add .golangci.yaml	2024-01-09 09:36:58 -08:00
Bruce MacDonald	7e8f7c8358	remove ggml automatic re-pull (#1856 )	2024-01-08 14:41:01 -05:00
Michael Yang	0101e76dbe	Merge pull request #1797 from sublimator/nd-allow-extension-origins-still-needs-explicit-listing-2024-01-05 fix: allow extension origins (still needs explicit listing), fixes #1686	2024-01-05 17:20:09 -08:00
Patrick Devine	22e93efa41	add show info command and fix the modelfile	2024-01-05 12:20:05 -08:00
Nicholas Dudfield	8baaaa39c0	Allow extension origins (still needs explicit listing), fixes #1686	2024-01-05 09:06:47 +07:00
Bruce MacDonald	4ad6c9b11f	fix: pull either original model or from model on create (#1774 )	2024-01-04 01:34:38 -05:00
Bruce MacDonald	0b3118e0af	fix: relay request opts to loaded llm prediction (#1761 )	2024-01-03 12:01:42 -05:00
Daniel Hiltgen	697bea6939	Guard integration tests with a tag This should help CI avoid running the integration test logic in a container where it's not currently possible.	2023-12-22 16:33:27 -08:00
Bruce MacDonald	db356c8519	post-response templating (#1427 )	2023-12-22 17:07:05 -05:00
Daniel Hiltgen	96fb441abd	Merge pull request #1146 from dhiltgen/ext_server_cgo Add cgo implementation for llama.cpp	2023-12-22 08:16:31 -08:00
Michael Yang	63aac0edc5	fix(test): use real version string for comparison	2023-12-19 15:03:02 -08:00
Daniel Hiltgen	51082535e1	Add automated test for multimodal A simple test case that verifies llava:7b can read text in an image	2023-12-19 09:05:46 -08:00
Daniel Hiltgen	35934b2e05	Adapted rocm support to cgo based llama.cpp	2023-12-19 09:05:46 -08:00
Daniel Hiltgen	d4cd695759	Add cgo implementation for llama.cpp Run the server.cpp directly inside the Go runtime via cgo while retaining the LLM Go abstractions.	2023-12-19 09:05:46 -08:00
Bruce MacDonald	5e7fd6906f	Update images.go	2023-12-19 09:05:46 -08:00
Bruce MacDonald	811b1f03c8	deprecate ggml - remove ggml runner - automatically pull gguf models when ggml detected - tell users to update to gguf in the case automatic pull fails Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>	2023-12-19 09:05:46 -08:00
Bruce MacDonald	d99fa6ce0a	send empty messages on last chat response (#1530 )	2023-12-18 14:23:38 -05:00
Patrick Devine	3948c6ea06	add magic header for unit tests (#1558 )	2023-12-18 10:41:02 -08:00
Patrick Devine	86b0dd4b16	add API create/copy handlers (#1541 )	2023-12-15 11:59:18 -08:00
Patrick Devine	0174665d0e	add API tests for list handler (#1535 )	2023-12-14 18:18:25 -08:00
Patrick Devine	630518f0d9	Add unit test of API routes (#1528 )	2023-12-14 16:47:40 -08:00
Bruce MacDonald	6ee8c80199	restore model load duration on generate response (#1524 ) * restore model load duration on generate response - set model load duration on generate and chat done response - calculate createAt time when response created * remove checkpoints predict opts * Update routes.go	2023-12-14 12:15:50 -05:00
Jeffrey Morgan	4a1abfe4fa	fix tests	2023-12-13 14:42:30 -05:00
Patrick Devine	d9e60f634b	add image support to the chat api (#1490 )	2023-12-12 13:28:58 -08:00
Jeffrey Morgan	0a9d348023	Fix issues with `/set template` and `/set system` (#1486 )	2023-12-12 14:43:19 -05:00
Patrick Devine	910e9401d0	Multimodal support (#1216 ) --------- Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>	2023-12-11 13:56:22 -08:00
Jeffrey Morgan	7db5bcf73b	fix `go-staticcheck` warning	2023-12-10 11:44:27 -05:00
Jeffrey Morgan	fa2f095bd9	fix model name returned by `/api/generate` being different than the model name provided	2023-12-10 11:42:15 -05:00
Jeffrey Morgan	045b855db9	fix error on accumulating final chat response	2023-12-10 11:24:39 -05:00
Jeffrey Morgan	32064a0646	fix empty response when receiving runner error	2023-12-10 10:53:38 -05:00
Jeffrey Morgan	9e1406e4ed	Don't expose model information in `/api/generate`	2023-12-09 02:05:43 -08:00
Bruce MacDonald	7e9405fd07	fix: encode full previous prompt in context (#1424 )	2023-12-08 16:53:51 -05:00
Bruce MacDonald	3b0b8930d4	fix: only flush template in chat when current role encountered (#1426 )	2023-12-08 16:44:24 -05:00
Bruce MacDonald	e3f925fc1b	fix: restore modelfile system in prompt template (#1425 )	2023-12-08 14:20:19 -05:00
Michael Yang	1f05d77110	Merge pull request #1244 from jmorganca/brucemacd/no-fail-template do not fail on unsupported template variables	2023-12-06 13:23:04 -08:00
Michael Yang	c3ff36088b	Merge pull request #774 from jmorganca/mxyng/server-version add version api and show server version in cli	2023-12-06 13:22:55 -08:00
Bruce MacDonald	47d4e22673	use missingkey in set empty interface when missing	2023-12-05 15:49:05 -08:00
Michael Yang	5d75505ebd	return model configuration in generate	2023-12-05 14:39:02 -08:00
Michael Yang	b9495ea162	load projectors	2023-12-05 14:36:12 -08:00
Michael Yang	409bb9674e	Merge pull request #1308 from jmorganca/mxyng/split-from split from into one or more models	2023-12-05 14:33:03 -08:00
Michael Yang	d3479c07a1	Merge pull request #1250 from jmorganca/mxyng/create-layer refactor layer creation	2023-12-05 14:32:52 -08:00
Bruce MacDonald	195e3d9dbd	chat api endpoint (#1392 )	2023-12-05 14:57:33 -05:00
Michael Yang	1ebdbd9694	server: add version handler	2023-12-05 09:36:01 -08:00
Jeffrey Morgan	00d06619a1	Revert "chat api (#991 )" while context variable is fixed This reverts commit `7a0899d62d`.	2023-12-04 21:16:27 -08:00
Michael Yang	a3737cbd33	use NewLayer for CreateBlobHandler	2023-12-04 16:59:23 -08:00
Michael Yang	998f1785b6	add modelfamilies	2023-12-04 16:59:23 -08:00
Michael Yang	70a93057cd	refactor layer creation previous layer creation was not ideal because: 1. it required reading the input file multiple times, once to calculate the sha256 checksum, another to write it to disk, and potentially one more to decode the underlying gguf 2. used io.ReadSeeker which is prone to user error. if the file isn't reset correctly or in the right place, it could end up reading an empty file there is also some brittleness when reading existing layers, else writing the inherited layers will error reading an already closed file this commit aims to fix these issues by restructuring layer creation. 1. it will now write the layer to a temporary file as well as the hash function and move it to the final location on Commit 2. layers are read once when copied to the destination. exception is raw model files which still require a second read to decode the model metadata	2023-12-04 16:59:23 -08:00
Michael Yang	2cb0fa7d40	split from into one or more models	2023-12-04 16:59:23 -08:00
Bruce MacDonald	7a0899d62d	chat api (#991 ) - update chat docs - add messages chat endpoint - remove deprecated context and template generate parameters from docs - context and template are still supported for the time being and will continue to work as expected - add partial response to chat history	2023-12-04 18:01:06 -05:00
Joshua Pham	bb80a597db	Fix adapter loading from SHA hash	2023-12-01 13:50:55 -05:00
Michael Yang	13efd5f218	upload: fix PUT retry	2023-11-29 16:38:35 -08:00
Michael Yang	c4bdfffd96	upload: separate progress tracking	2023-11-29 16:38:33 -08:00
Michael Yang	26c63418e0	new hasher	2023-11-29 14:52:41 -08:00
Michael Yang	2799784ac8	revert checksum calculation to calculate-as-you-go	2023-11-29 13:47:58 -08:00
Bruce MacDonald	96122b7271	validate model tags on copy (#1323 )	2023-11-29 15:54:29 -05:00
Timothy Jaeryang Baek	c2e3b89176	fix: disable ':' in tag names (#1280 ) Co-authored-by: rootedbox	2023-11-29 13:33:45 -05:00
Patrick Devine	cde31cb220	Allow setting parameters in the REPL (#1294 )	2023-11-29 09:56:42 -08:00
Bruce MacDonald	37d95157df	fix relative path on create (#1222 )	2023-11-21 15:43:17 -05:00
Jeffrey Morgan	35c4b5ec16	calculate hash separately from http request	2023-11-20 15:45:11 -05:00
Jeffrey Morgan	9d73d3a6b5	add back `part.Reset()`	2023-11-19 14:32:19 -05:00
Jeffrey Morgan	72cd336410	dont retry on upload complete context cancel	2023-11-19 14:32:19 -05:00
Jeffrey Morgan	1bd594b2fa	revert to using one open file for blob uploads	2023-11-19 14:32:19 -05:00
Jeffrey Morgan	9a8c21ac3d	use exponential everywhere	2023-11-19 14:32:19 -05:00
Jeffrey Morgan	f6b317e8c9	fix sending too little data in chunk upload body	2023-11-19 14:32:19 -05:00
Jeffrey Morgan	ac5076ce1e	exponential backoff up to 30s	2023-11-19 14:32:19 -05:00
Michael Yang	42c2e3a624	upload: retry complete upload	2023-11-19 14:32:19 -05:00
Michael Yang	cb42589792	adjust download/upload parts	2023-11-19 14:32:19 -05:00
Jeffrey Morgan	e1d7056496	update progress statuses	2023-11-19 09:21:13 -05:00
Jeffrey Morgan	02524a56ff	check retry for authorization error	2023-11-19 00:19:53 -05:00
Jeffrey Morgan	12e046f12a	remove unused function	2023-11-18 22:16:51 -05:00
Bruce MacDonald	43a726149d	fix potentially inaccurate error message	2023-11-18 21:25:07 -05:00
Jeffrey Morgan	bab9494176	add `-` separator to temp file created on `ollama create`	2023-11-18 09:39:52 -05:00
Michael Yang	c6e6c8ee7e	fix cross device rename	2023-11-17 15:22:17 -08:00
Michael Yang	c1bbf5ddee	Merge pull request #1134 from jmorganca/mxyng/progress progress bar	2023-11-17 14:03:35 -08:00
Bruce MacDonald	0b19e24d81	only retry once on auth failure (#1175 )	2023-11-17 14:22:35 -05:00
Michael Yang	d6ecaa2cbf	update progress responses	2023-11-17 10:06:19 -08:00
Bruce MacDonald	4b3f4bc7d9	return failure details when unauthorized to push (#1131 ) Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2023-11-16 16:44:18 -05:00
Michael Yang	a5ccf742c1	fix cross repo mounts	2023-11-16 16:33:30 -05:00
Michael Yang	e33ef391cd	fix push scope error for inherited model	2023-11-16 16:33:30 -05:00
Michael Yang	54f92f01cb	update docs	2023-11-15 15:28:15 -08:00
Michael Yang	652d90e1c7	Update server/images.go Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2023-11-15 15:16:23 -08:00
Michael Yang	bc22d5a38b	no blob response	2023-11-15 15:16:23 -08:00
Michael Yang	1901044b07	use checksum reference	2023-11-15 15:16:23 -08:00
Michael Yang	a07c935d34	ignore non blobs	2023-11-15 15:16:23 -08:00
Michael Yang	1552cee59f	client create modelfile	2023-11-15 15:16:23 -08:00
Michael Yang	3ca56b5ada	add create modelfile field	2023-11-15 15:16:23 -08:00
Michael Yang	b0d14ed51c	refactor create model	2023-11-15 15:16:23 -08:00
Michael Yang	d91c103e74	Merge pull request #1055 from dansreis/946-fix-incorrect-base-model-name Fixed incorrect base model name	2023-11-13 08:42:55 -08:00
Daniel Reis	7c438f2c53	Replaced method	2023-11-10 20:22:03 +00:00
Daniel Reis	6e46338d44	Reverting previous changes	2023-11-10 20:21:35 +00:00
Daniel Hiltgen	cc54a416c6	Resume chunk download on UnexpectedEOF errors If the chunk download is interrupted, resume from where we left off	2023-11-10 08:29:42 -08:00
Jeffrey Morgan	5cba29b9d6	JSON mode: add `"format"` as an api parameter (#1051 ) * add `"format": "json"` as an API parameter --------- Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2023-11-09 16:44:02 -08:00
Daniel Reis	d17730356a	Removed inline parse model path	2023-11-09 22:44:26 +00:00
Daniel Reis	32d79a6eea	Using 'GetShortTagname' method instead	2023-11-09 22:40:37 +00:00
Bruce MacDonald	ec2a31e9b3	support raw generation requests (#952 ) - add the optional `raw` generate request parameter to bypass prompt formatting and response context -add raw request to docs	2023-11-08 14:05:02 -08:00
Michael Yang	146072113d	Merge pull request #993 from jmorganca/mxyng/cleanup cleanup upload and download errors	2023-11-06 11:32:12 -08:00
Jeffrey Morgan	e21579a0f1	Restore system prompt on requests	2023-11-03 17:26:45 -07:00
Michael Yang	434a6f9d46	return last error	2023-11-03 16:49:51 -07:00
Michael Yang	84725ec7e3	refactor part reset	2023-11-03 09:20:32 -07:00
Noah Gitsham	8ae8c9fa8c	Remove duplicate "install" in GPU support warning (#984 )	2023-11-03 00:45:14 -07:00
Noah Gitsham	f39daff461	Add missing "be" to GPU support warning message (#983 )	2023-11-02 18:37:12 -07:00
Jeffrey Morgan	c50b01bc21	check `request.Context` for initial system prompt	2023-11-02 18:17:00 -07:00
Bruce MacDonald	b9dc875401	remove modelfile context deprecated in v0.0.7 (#974 )	2023-11-02 20:52:56 -04:00
Michael Yang	1fd511e661	Merge pull request #975 from jmorganca/mxyng/downloads update downloads to use retry wrapper	2023-11-02 16:12:48 -07:00
Jeffrey Morgan	1beb5645a9	only use system prompt if context is not provided (#978 )	2023-11-02 15:48:02 -07:00
Michael Yang	fe5a872444	fix upload	2023-11-02 13:25:58 -07:00
Michael Yang	d39709260f	download with retry	2023-11-02 13:16:11 -07:00
Michael Yang	60bb3c03a1	use http.Method	2023-11-02 13:12:45 -07:00
Michael Yang	c4cc738cbf	fix log	2023-11-01 17:18:11 -07:00
Michael Yang	2c6189f4fe	Merge pull request #750 from jmorganca/mxyng/concurrent-uploads concurrent uploads	2023-11-01 15:00:01 -07:00
Bruce MacDonald	f9a4281124	clean up: remove server functions from client (#937 )	2023-10-30 11:10:18 -04:00
Michael Yang	115fc56eb7	calculate and verify md5 checksum	2023-10-27 17:07:33 -07:00
Michael Yang	186f685224	retry PUT	2023-10-27 17:07:33 -07:00
Michael Yang	12efcbb057	comments	2023-10-27 17:07:33 -07:00
Michael Yang	4e09aab8b9	concurrent uploads	2023-10-27 17:07:33 -07:00
Bruce MacDonald	5c3491f425	allow for a configurable ollama model storage directory (#897 ) * allow for a configurable ollama models directory - set OLLAMA_MODELS in the environment that ollama is running in to change where model files are stored - update docs Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com> Co-Authored-By: Jay Nakrani <dhananjaynakrani@gmail.com> Co-Authored-By: Akhil Acharya <akhilcacharya@gmail.com> Co-Authored-By: Sasha Devol <sasha.devol@protonmail.com>	2023-10-27 10:19:59 -04:00
Michael Yang	910816a532	fix(download): no retry when out of space	2023-10-26 11:34:07 -07:00
Michael Yang	386169205c	update runtime options (#864 )	2023-10-20 21:17:14 -04:00
Michael Yang	75bee074b6	fix: nil pointer dereference	2023-10-20 16:55:24 -07:00
Jeffrey Morgan	7ed5a39bc7	simpler check for model loading compatibility errors	2023-10-19 14:50:49 -04:00
Michael Yang	846f593dbf	Merge pull request #828 from jmorganca/mxyng/template-parameters image: show parameters	2023-10-19 09:31:31 -07:00
Michael Yang	e1c5be24e7	check json eof	2023-10-19 09:21:51 -07:00
Michael Yang	2ad8a074ac	generate: set created_at move the empty response so it's more visible	2023-10-19 09:21:51 -07:00
Michael Yang	7e547c6833	s/message/error/	2023-10-19 09:21:04 -07:00
Michael Yang	689842b9ff	request: bad request when model missing fields	2023-10-19 09:21:04 -07:00
Michael Yang	a19d47642e	models: rm workDir from CreateModel unused after removing EMBED	2023-10-19 09:21:04 -07:00
Bruce MacDonald	fe6f3b48f7	do not reload the running llm when runtime params change (#840 ) - only reload the running llm if the model has changed, or the options for loading the running model have changed - rename loaded llm to runner to differentiate from loaded model image - remove logic which keeps the first system prompt in the generation context	2023-10-19 10:39:58 -04:00
Michael Yang	4dcceeffb7	let the template do the work	2023-10-18 13:12:00 -07:00
Michael Yang	019e4a4558	image: show parameters	2023-10-18 13:12:00 -07:00
Michael Yang	627d04d927	Merge pull request #827 from jmorganca/mxyng/template-adapters model: native gotemplate adapter template	2023-10-18 13:11:25 -07:00
Michael Yang	940e8ebec3	Merge pull request #826 from jmorganca/mxyng/template-system show: no template system if empty	2023-10-18 13:11:09 -07:00
Yiorgis Gozadinos	8c6c2cbc8c	When the .ollama folder is broken or there are no models return an empty list on /api/tags	2023-10-18 08:23:20 +02:00
Michael Yang	8299bf76ed	model: native gotemplate adapter template	2023-10-17 15:28:38 -07:00
Michael Yang	ee4979e510	show: no template system if empty	2023-10-17 15:25:43 -07:00
Michael Yang	1af493c5a0	server: print version on start	2023-10-16 09:59:14 -07:00
Bruce MacDonald	a0c3e989de	deprecate modelfile embed command (#759 )	2023-10-16 11:07:37 -04:00
Michael Yang	7a537cdca9	Merge pull request #770 from jmorganca/mxyng/fix-download fix download	2023-10-12 12:56:43 -07:00
Michael Yang	257ffeb997	fix download	2023-10-12 12:52:43 -07:00
Bruce MacDonald	7804b8fab9	validate api options fields from map (#711 )	2023-10-12 11:18:11 -04:00
Michael Yang	c413a55093	download: handle inner errors	2023-10-11 14:15:30 -07:00
Michael Yang	630bb75d2a	dynamically size download parts based on file size	2023-10-11 14:10:25 -07:00
Michael Yang	a2055a1e93	update download	2023-10-11 14:10:25 -07:00
Bruce MacDonald	274d5a5fdf	optional parameter to not stream response (#639 ) * update streaming request accept header * add optional stream param to request bodies	2023-10-11 12:54:27 -04:00
Jeffrey Morgan	65dcd0ce35	always cleanup blob download (#747 )	2023-10-10 13:12:29 -04:00
Michael Yang	f6e98334e4	handle upstream proxies	2023-10-09 11:42:36 -07:00
Bruce MacDonald	af4cf55884	not found error before pulling model (#718 )	2023-10-06 16:06:20 -04:00
Bruce MacDonald	d6786f2945	add feedback for reading model metadata (#722 )	2023-10-06 16:05:32 -04:00
Michael Yang	0560b28a8d	names	2023-10-06 12:56:56 -07:00
Michael Yang	10199c5987	replace done channel with file check	2023-10-06 12:56:56 -07:00
Michael Yang	288814d3e4	fix ref counts	2023-10-06 12:56:43 -07:00
Michael Yang	04733438da	check head request response	2023-10-06 12:56:43 -07:00
Michael Yang	711e891f0f	fix resumable downloads glob returns files in lexical order which is not appropriate when rebuilding the parts list	2023-10-06 12:56:43 -07:00
Michael Yang	090d08422b	handle unexpected eofs	2023-10-06 12:56:43 -07:00
Michael Yang	5b84404c64	handle concurrent requests for the same blobs	2023-10-06 12:56:43 -07:00
Michael Yang	8544edca21	parallel chunked downloads	2023-10-06 12:56:43 -07:00
Bruce MacDonald	2130c0708b	output type parsed from modelfile (#678 )	2023-10-05 14:58:04 -04:00
Jay Nakrani	1d0ebe67e8	Document response stream chunk delimiter. (#632 ) Document response stream chunk delimiter.	2023-09-29 21:45:52 -07:00
Bruce MacDonald	a1b2d95f96	remove unused push/pull params (#650 )	2023-09-29 17:27:19 -04:00
Michael Yang	9333b0cc82	Merge pull request #612 from jmorganca/mxyng/prune-empty-directories prune empty directories	2023-09-29 11:23:39 -07:00
Michael Yang	f40b3de758	use int64 consistently	2023-09-28 11:07:24 -07:00
Michael Yang	8608eb4760	prune empty directories	2023-09-27 10:58:09 -07:00
Jeffrey Morgan	9b12a511ca	check other request fields before load short circuit in `/api/generate`	2023-09-22 23:50:55 -04:00
Bruce MacDonald	5d71bda478	close llm on interrupt (#577 )	2023-09-22 19:41:52 +01:00
Michael Yang	82f5b66c01	register HEAD /api/tags	2023-09-21 16:38:03 -07:00
Michael Yang	c986694367	fix HEAD / request HEAD request should respond like their GET counterparts except without a response body.	2023-09-21 16:35:58 -07:00
Bruce MacDonald	4cba75efc5	remove tmp directories created by previous servers (#559 ) * remove tmp directories created by previous servers * clean up on server stop * Update routes.go * Update server/routes.go Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * create top-level temp ollama dir * check file exists before creating --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> Co-authored-by: Michael Yang <mxyng@pm.me>	2023-09-21 20:38:49 +01:00
Michael Yang	1fabba474b	refactor default allow origins this should be less error prone	2023-09-21 09:42:25 -07:00
Michael Yang	ee4fd16f2c	Merge pull request #556 from jmorganca/pack-cuda pack in cuda libs	2023-09-20 15:02:36 -07:00
Bruce MacDonald	1255bc9b45	only package 11.8 runner	2023-09-20 20:00:41 +01:00
Michael Yang	499e9007a5	pick chunksize based on location	2023-09-20 11:10:24 -07:00
Michael Yang	aa45d7c1df	draft: explicitly follow upload redirects	2023-09-19 13:36:58 -07:00
Michael Yang	a5520bfb42	fix build	2023-09-19 10:42:24 -07:00
Michael Yang	b58d5d16b0	fix mkdir on windows	2023-09-19 09:41:13 -07:00
Patrick Devine	24580df958	only add a layer if there is actual data (#535 )	2023-09-18 13:47:45 -07:00
Patrick Devine	80dd44e80a	Cmd changes (#541 )	2023-09-18 12:26:56 -07:00
Michael Yang	08d7c2a944	fix error on upload chunk	2023-09-15 15:59:30 -07:00
Michael Yang	e53bc57d4d	split uploadBlobChunked	2023-09-14 17:22:05 -07:00
Michael Yang	f0b398d17f	implement ProgressWriter	2023-09-14 17:22:04 -07:00
Michael Yang	daa4f096f9	set request.ContentLength This informs the HTTP client the content length is known and disables chunked Transfer-Encoding	2023-09-14 13:32:44 -07:00
Michael Yang	e6881cabd0	remove unused	2023-09-13 14:48:33 -07:00
Michael Yang	0c5a454361	fix model type for 70b	2023-09-12 15:12:59 -07:00
Michael Yang	7dee25a07f	fix falcon decode get model and file type from bin file	2023-09-12 12:34:53 -07:00
Bruce MacDonald	f221637053	first pass at linux gpu support (#454 ) * linux gpu support * handle multiple gpus * add cuda docker image (#488) --------- Co-authored-by: Michael Yang <mxyng@pm.me>	2023-09-12 11:04:35 -04:00
Patrick Devine	45ac07cd02	create the blobs directory correctly (#508 )	2023-09-11 14:54:52 -07:00
Patrick Devine	e7e91cd71c	add autoprune to remove unused layers (#491 )	2023-09-11 11:46:35 -07:00
Jeffrey Morgan	3920e15386	add model format to config layer (#497 )	2023-09-09 17:53:44 -04:00
Michael Yang	de227b620f	fix nil pointer dereference	2023-09-07 17:24:31 -07:00
Michael Yang	738fe9c4aa	Merge pull request #486 from jmorganca/mxyng/fix-push fix: retry push on expired token	2023-09-07 13:58:34 -07:00
Michael Yang	bf146fb072	fix retry on unauthorized chunk	2023-09-07 12:02:04 -07:00
Michael Yang	f0f4943577	fix get auth token	2023-09-07 12:01:56 -07:00
Bruce MacDonald	09dd2aeff9	GGUF support (#441 )	2023-09-07 13:55:37 -04:00
Michael Yang	83c6be1666	fix model manifests (#477 )	2023-09-06 17:30:08 -04:00
Patrick Devine	790d24eb7b	add show command (#474 )	2023-09-06 11:04:17 -07:00
Michael Yang	a1ecdd36d5	create manifests directory	2023-09-05 17:10:40 -07:00
Michael Yang	d1c2558f7e	Merge pull request #461 from jmorganca/mxyng/fix-inherit-params fix inherit params	2023-09-05 12:30:23 -07:00
Michael Yang	06ef90c051	fix parameter inheritance parameters are not inherited because they are processed differently from other layers. fix this by explicitly merging the inherited params into the new params. parameter values defined in the new modelfile will override those defined in the inherited modelfile. array lists are replaced instead of appended	2023-09-05 11:40:20 -07:00
Michael Yang	e9f6df7dca	use slices.DeleteFunc	2023-09-05 09:56:59 -07:00
Michael Yang	681f3c4c42	fix num_keep	2023-09-03 17:47:49 -04:00
Quinn Slack	62d29b2157	do not HTML-escape prompt The `html/template` package automatically HTML-escapes interpolated strings in templates. This behavior is undesirable because it causes prompts like `<h1>hello` to be escaped to `&lt;h1&gt;hello` before being passed to the LLM. The included test case passes, but before the code change, it failed: ``` --- FAIL: TestModelPrompt images_test.go:21: got "a&lt;h1&gt;b", want "a<h1>b" ```	2023-09-01 17:16:38 -05:00
Michael Yang	1c8fd627ad	windows: fix create modelfile	2023-08-31 09:47:10 -04:00
Michael Yang	ae950b00f1	windows: fix delete	2023-08-31 09:47:10 -04:00
Michael Yang	eeb40a672c	fix list models for windows	2023-08-31 09:47:10 -04:00
Michael Yang	0f541a0367	s/ListResponseModel/ModelResponse/	2023-08-31 09:47:10 -04:00
Bruce MacDonald	42998d797d	subprocess llama.cpp server (#401 ) * remove c code * pack llama.cpp * use request context for llama_cpp * let llama_cpp decide the number of threads to use * stop llama runner when app stops * remove sample count and duration metrics * use go generate to get libraries * tmp dir for running llm	2023-08-30 16:35:03 -04:00
Quinn Slack	f4432e1dba	treat stop as stop sequences, not exact tokens (#442 ) The `stop` option to the generate API is a list of sequences that should cause generation to stop. Although these are commonly called "stop tokens", they do not necessarily correspond to LLM tokens (per the LLM's tokenizer). For example, if the caller sends a generate request with `"stop":["\n"]`, then generation should stop on any token containing `\n` (and trim `\n` from the output), not just if the token exactly matches `\n`. If `stop` were interpreted strictly as LLM tokens, then it would require callers of the generate API to know the LLM's tokenizer and enumerate many tokens in the `stop` list. Fixes https://github.com/jmorganca/ollama/issues/295.	2023-08-30 11:53:42 -04:00
Michael Yang	982c535428	Merge pull request #428 from jmorganca/mxyng/upload-chunks update upload chunks	2023-08-30 07:47:17 -07:00
Patrick Devine	8bbff2df98	add model IDs (#439 )	2023-08-28 20:50:24 -07:00
Michael Yang	16b06699fd	remove unused parameter	2023-08-28 18:35:18 -04:00
Michael Yang	246dc65417	loosen http status code checks	2023-08-28 18:34:53 -04:00
Michael Yang	865fceb73c	chunked pipe	2023-08-28 18:34:53 -04:00
Michael Yang	72266c7684	bump chunk size to 95MB	2023-08-28 18:34:53 -04:00
Michael Yang	59734ca24d	set default template	2023-08-26 12:20:48 -07:00
Michael Yang	32d1a00017	remove unused requestContextKey	2023-08-22 10:49:54 -07:00
Michael Yang	04e2128273	move upload funcs to upload.go	2023-08-22 10:49:53 -07:00
Michael Yang	2cc634689b	use url.URL	2023-08-22 10:49:07 -07:00
Michael Yang	95187d7e1e	build release mode	2023-08-22 09:52:43 -07:00
Michael Yang	9ec7e37534	Merge pull request #392 from jmorganca/mxyng/version add version	2023-08-22 09:50:25 -07:00
Michael Yang	2c7f956b38	add version	2023-08-22 09:40:58 -07:00
Jeffrey Morgan	a9f6c56652	fix `FROM` instruction erroring when referring to a file	2023-08-22 09:39:42 -07:00
Ryan Baker	0a892419ad	Strip protocol from model path (#377 )	2023-08-21 21:56:56 -07:00
Michael Yang	3b49315f97	retry on unauthorized chunk push The token printed for authorized requests has a lifetime of 1h. If an upload exceeds 1h, a chunk push will fail since the token is created on a "start upload" request. This replaces the Pipe with SectionReader which is simpler and implements Seek, a requirement for makeRequestWithRetry. This is slightly worse than using a Pipe since the progress update is directly tied to the chunk size instead of controlled separately.	2023-08-18 11:23:47 -07:00
Michael Yang	7eda70f23b	copy metadata from source	2023-08-17 21:55:25 -07:00
Michael Yang	086449b6c7	fmt	2023-08-17 15:32:31 -07:00
Michael Yang	3cbc6a5c01	fix push manifest	2023-08-17 15:28:12 -07:00
Michael Yang	a894cc792d	model and file type as strings	2023-08-17 12:08:04 -07:00
Michael Yang	b963a83559	Merge pull request #364 from jmorganca/chunked-uploads reimplement chunked uploads	2023-08-17 09:58:51 -07:00
Michael Yang	bf6688abe6	Merge pull request #360 from jmorganca/fix-request-copies Fix request copies	2023-08-17 09:58:42 -07:00
Bruce MacDonald	6005b157c2	retry download on network errors	2023-08-17 10:31:45 -04:00
Patrick Devine	14220d9833	set the scopes correctly (#368 )	2023-08-16 21:42:02 -07:00
Michael Yang	5dfe91be8b	reimplement chunked uploads	2023-08-16 14:50:24 -07:00
Michael Yang	9f944c00f1	push: retry on unauthorized	2023-08-16 11:35:33 -07:00
Michael Yang	56e87cecb1	images: remove body copies	2023-08-16 10:30:41 -07:00
Michael Yang	5d9a4cd251	Merge pull request #348 from jmorganca/cross-repo-mount cross repo blob mount	2023-08-16 09:20:36 -07:00
Bruce MacDonald	1deb35ca64	use loaded llm for generating model file embeddings	2023-08-15 16:12:02 -03:00
Bruce MacDonald	e2de886831	do not regenerate embeddings	2023-08-15 16:10:22 -03:00
Bruce MacDonald	f0d7c2f5ea	retry download on network errors	2023-08-15 15:07:19 -03:00
Bruce MacDonald	12052a7624	always remove from in progress map on download	2023-08-15 13:20:32 -03:00
Bruce MacDonald	326de48930	use loaded llm for embeddings	2023-08-15 10:50:54 -03:00
Bruce MacDonald	18f2cb0472	dont log fatal	2023-08-15 10:39:59 -03:00
Michael Yang	e26085b921	close open files	2023-08-14 16:08:06 -07:00
Michael Yang	f594c8eb91	cross repo mount	2023-08-14 15:07:35 -07:00
Bruce MacDonald	f020e1d519	always remove from in progress map on download	2023-08-14 13:09:20 -03:00
Bruce MacDonald	2c8b680b03	use file info for embeddings cache	2023-08-14 12:11:04 -03:00
Bruce MacDonald	99b6b60085	use model bin digest for embed digest	2023-08-14 11:57:12 -03:00
Bruce MacDonald	e9a9580bdd	do not regenerate embeddings - re-use previously evaluated embeddings when possible - change embeddings digest identifier to be based on model name and embedded file path	2023-08-14 10:34:17 -03:00
Patrick Devine	d9cf18e28d	add maximum retries when pushing (#334 )	2023-08-11 15:41:55 -07:00
Jeffrey Morgan	1556162c90	create `.ollama` directory if it doesnt exist	2023-08-11 15:35:55 -07:00
Jeffrey Morgan	148f0225c0	create `.ollama` directory if it doesnt exist	2023-08-11 15:33:11 -07:00
Michael Yang	6517bcc53c	Merge pull request #290 from jmorganca/add-adapter-layers implement loading ggml lora adapters through the modelfile	2023-08-10 17:23:01 -07:00
Michael Yang	6a6828bddf	Merge pull request #167 from jmorganca/decode-ggml partial decode ggml bin for more info	2023-08-10 17:22:40 -07:00
Patrick Devine	be989d89d1	Token auth (#314 )	2023-08-10 11:34:25 -07:00
Jeffrey Morgan	040a5b9750	clean up cli flags	2023-08-10 09:27:03 -07:00
Michael Yang	6de5d032e1	implement loading ggml lora adapters through the modelfile	2023-08-10 09:23:39 -07:00
Michael Yang	fccf8d179f	partial decode ggml bin for more info	2023-08-10 09:23:10 -07:00
Bruce MacDonald	4b3507f036	embeddings endpoint Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>	2023-08-10 11:45:57 -04:00
Bruce MacDonald	984c9c628c	fix embeddings invalid values	2023-08-09 16:50:53 -04:00
Bruce MacDonald	ac971c56d1	Update images.go	2023-08-09 11:31:54 -04:00
Bruce MacDonald	8228d166ce	pr comments	2023-08-09 11:31:54 -04:00
Bruce MacDonald	907e6c56b3	unlock downloadu in case or requestDownload err	2023-08-09 11:31:54 -04:00
Bruce MacDonald	868e3b31c7	allow for concurrent pulls of the same files	2023-08-09 11:31:54 -04:00
Bruce MacDonald	09d8bf6730	fix build errors	2023-08-09 10:45:57 -04:00
Bruce MacDonald	7a5f3616fd	embed text document in modelfile	2023-08-09 10:26:19 -04:00
Jeffrey Morgan	cff002b824	use content type `application/x-ndjson` for streaming responses	2023-08-08 21:38:10 -07:00
Bruce MacDonald	1bee2347be	pr feedback - defer closing llm on embedding - do not override licenses - remove debugging print line - reformat model file docs	2023-08-08 17:01:37 -04:00
Jeffrey Morgan	a027a7dd65	add `0.0.0.0` as an allowed origin by default Fixes #282	2023-08-08 13:39:50 -07:00
Bruce MacDonald	884d78ceb3	allow embedding from model binary	2023-08-08 14:38:57 -04:00
Bruce MacDonald	21ddcaa1f1	pr comments - default to embeddings enabled - move embedding logic for loaded model to request - allow embedding full directory - close llm on reload	2023-08-08 13:49:37 -04:00
Michael Yang	f2074ed4c0	Merge pull request #306 from jmorganca/default-keep-system automatically set num_keep if num_keep < 0	2023-08-08 09:25:34 -07:00
Bruce MacDonald	a6f6d18f83	embed text document in modelfile	2023-08-08 11:27:17 -04:00
Bruce MacDonald	34a13a9d05	pass flags to `serve` to allow setting allowed-origins + host and port	2023-08-08 10:41:42 -04:00
Jeffrey Morgan	8713ac23a8	allow overriding `template` and `system` in `/api/generate` Fixes #297 Fixes #296	2023-08-08 00:55:34 -04:00
Michael Yang	4dc5b117dd	automatically set num_keep if num_keep < 0 num_keep defines how many tokens to keep in the context when truncating inputs. if left to its default value of -1, the server will calculate num_keep to be the left of the system instructions	2023-08-07 16:19:12 -07:00
cmiller01	fb593b7bfc	pass flags to `serve` to allow setting allowed-origins + host and port * resolves: https://github.com/jmorganca/ollama/issues/300 and https://github.com/jmorganca/ollama/issues/282 * example usage: ``` ollama serve --port 9999 --allowed-origins "http://foo.example.com,http://192.0.0.1" ```	2023-08-07 03:34:37 +00:00
Jeffrey Morgan	e3fb1fd3f1	server: compare options correctly	2023-08-03 15:55:40 -04:00
Michael Yang	a71ff3f6a2	use a pipe to push to registry with progress switch to a monolithic upload instead of a chunk upload through a pipe to report progress	2023-08-03 10:37:13 -07:00
Bruce MacDonald	8b1e791820	allow specifying zero values in modelfile	2023-08-02 17:07:53 -04:00
Jeffrey Morgan	03cff3a225	server: reset digest at end of generate	2023-08-02 16:15:44 -04:00
Bruce MacDonald	8f8b6288ac	check server is running before running command	2023-08-02 10:51:23 -04:00
Bruce MacDonald	765994362c	use head to check heartbeat	2023-08-01 14:50:38 -04:00
Bruce MacDonald	1c5a8770ee	read runner parameter options from map - read runner options from map to see what was specified explicitly and overwrite zero values	2023-08-01 13:38:19 -04:00
Bruce MacDonald	daa0d1de7a	allow specifying zero values in modelfile	2023-08-01 13:37:50 -04:00
Jeffrey Morgan	528bafa585	cache loaded model	2023-08-01 11:24:18 -04:00
Michael Yang	872011630a	fix license	2023-07-31 21:46:48 -07:00
Michael Yang	203fdbc4b8	check err	2023-07-31 21:46:48 -07:00
Michael Yang	70e0ab6b3d	remove unnecessary fmt.Sprintf	2023-07-31 21:46:47 -07:00
Jeffrey Morgan	9968153729	fix Go warnings	2023-07-31 21:37:40 -04:00
Bruce MacDonald	671eec6da9	log prediction failures	2023-07-31 16:46:37 -04:00
Michael Yang	eadee46840	Merge pull request #236 from jmorganca/check-os-walk check os.Walk err	2023-07-28 14:14:21 -07:00
Michael Yang	bd58528fbd	check os.Walk err	2023-07-28 12:15:31 -07:00
Michael Yang	c5e447a359	remove io/ioutil import ioutil is deprecated	2023-07-28 12:06:03 -07:00
Bruce MacDonald	f5cbcb08e6	specify stop params separately	2023-07-28 11:29:00 -04:00
Bruce MacDonald	184ad8f057	allow specifying stop conditions in modelfile	2023-07-28 11:02:04 -04:00
Bruce MacDonald	0345070dfa	update model file docs	2023-07-28 10:33:52 -04:00
Bruce MacDonald	1ac38ec89c	improve modelfile docs	2023-07-27 15:13:04 -04:00

751 commits