ollama

Author	SHA1	Message	Date
Jeffrey Morgan	63861f58cc	Support for `bert` and `nomic-bert` embedding models	2024-02-20 21:37:29 -05:00
Bruce MacDonald	88622847c6	fix: chat system prompting overrides (#2542)	2024-02-16 14:42:43 -05:00
Michael Yang	e43648afe5	rerefactor	2024-02-15 05:56:45 +00:00
Daniel Hiltgen	f397e0e988	Move hub auth out to new package	2024-02-15 05:56:45 +00:00
Jeffrey Morgan	48a273f80b	Fix issues with templating prompt in chat mode (#2460)	2024-02-12 15:06:57 -08:00
Jeffrey Morgan	1f9078d6ae	Check image filetype in api handlers (#2467)	2024-02-12 11:16:20 -08:00
Jeffrey Morgan	a0a199b108	Fix hanging issue when sending empty content (#2399)	2024-02-07 19:30:33 -05:00
Jeffrey Morgan	453f572f83	Initial OpenAI `/v1/chat/completions` API compatibility (#2376)	2024-02-07 17:24:29 -05:00
Michael Yang	bfbf2f7cf7	Merge pull request #2296 from ollama/mxyng/img-tags append image tags to user content	2024-02-01 13:16:59 -08:00
Michael Yang	3d6f48507a	structured debug prompt	2024-02-01 11:56:28 -08:00
Michael Yang	d125510b4b	remove image tags	2024-02-01 11:32:51 -08:00
Michael Yang	fb56988014	account for image projection in token count	2024-02-01 09:50:48 -08:00
Michael Yang	d046bee790	use llm.ImageData for chat	2024-01-31 19:18:25 -08:00
Jeffrey Morgan	f11bf0740b	use `llm.ImageData`	2024-01-31 19:13:48 -08:00
Michael Yang	8450bf66e6	trim images	2024-01-31 19:13:47 -08:00
Michael Yang	b4e11be8ef	append image tags to user content	2024-01-31 19:13:10 -08:00
Michael Yang	8ac08a0eec	update slog handler options - consistent format by using text handler for debug and non-debug - truncate source file to just the file name	2024-01-31 15:15:00 -08:00
Bruce MacDonald	0632dff3f8	trim chat prompt based on llm context size (#1963)	2024-01-30 15:59:29 -05:00
Jeffrey Morgan	f2245c7c77	print prompt with `OLLAMA_DEBUG=1` (#2245)	2024-01-28 15:22:35 -08:00
Jeffrey Morgan	e4b9b72f2a	Do not repeat system prompt for chat templating (#2241)	2024-01-28 14:15:56 -08:00
Patrick Devine	b5cf31b460	add keep_alive to generate/chat/embedding api endpoints (#2146)	2024-01-26 14:28:02 -08:00
Patrick Devine	7c40a67841	Save and load sessions (#2063)	2024-01-25 12:12:36 -08:00
Michael Yang	aac9ab4db7	fix show handler	2024-01-18 15:36:50 -08:00
Michael Yang	745b5934fa	add model to ModelResponse	2024-01-18 14:32:55 -08:00
Michael Yang	a38d88d828	api: add model for all requests prefer using req.Model and fallback to req.Name	2024-01-18 14:31:37 -08:00
Daniel Hiltgen	fedd705aea	Mechanical switch from log to slog A few obvious levels were adjusted, but generally everything mapped to "info" level.	2024-01-18 14:12:57 -08:00
Patrick Devine	eef50accb4	Fix show parameters (#2017)	2024-01-16 10:34:44 -08:00
Michael Yang	2b9892a808	fix(windows): modelpath and list	2024-01-09 09:36:58 -08:00
Michael Yang	2bb2bdd5d4	fix lint	2024-01-09 09:36:58 -08:00
Michael Yang	acfc376efd	add .golangci.yaml	2024-01-09 09:36:58 -08:00
Michael Yang	0101e76dbe	Merge pull request #1797 from sublimator/nd-allow-extension-origins-still-needs-explicit-listing-2024-01-05 fix: allow extension origins (still needs explicit listing), fixes #1686	2024-01-05 17:20:09 -08:00
Patrick Devine	22e93efa41	add show info command and fix the modelfile	2024-01-05 12:20:05 -08:00
Nicholas Dudfield	8baaaa39c0	Allow extension origins (still needs explicit listing), fixes #1686	2024-01-05 09:06:47 +07:00
Bruce MacDonald	0b3118e0af	fix: relay request opts to loaded llm prediction (#1761)	2024-01-03 12:01:42 -05:00
Bruce MacDonald	db356c8519	post-response templating (#1427)	2023-12-22 17:07:05 -05:00
Daniel Hiltgen	35934b2e05	Adapted rocm support to cgo based llama.cpp	2023-12-19 09:05:46 -08:00
Daniel Hiltgen	d4cd695759	Add cgo implementation for llama.cpp Run the server.cpp directly inside the Go runtime via cgo while retaining the LLM Go abstractions.	2023-12-19 09:05:46 -08:00
Bruce MacDonald	811b1f03c8	deprecate ggml - remove ggml runner - automatically pull gguf models when ggml detected - tell users to update to gguf in the case automatic pull fails Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>	2023-12-19 09:05:46 -08:00
Bruce MacDonald	d99fa6ce0a	send empty messages on last chat response (#1530)	2023-12-18 14:23:38 -05:00
Patrick Devine	630518f0d9	Add unit test of API routes (#1528)	2023-12-14 16:47:40 -08:00
Bruce MacDonald	6ee8c80199	restore model load duration on generate response (#1524) * restore model load duration on generate response - set model load duration on generate and chat done response - calculate createAt time when response created * remove checkpoints predict opts * Update routes.go	2023-12-14 12:15:50 -05:00
Patrick Devine	d9e60f634b	add image support to the chat api (#1490)	2023-12-12 13:28:58 -08:00
Patrick Devine	910e9401d0	Multimodal support (#1216) --------- Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>	2023-12-11 13:56:22 -08:00
Jeffrey Morgan	7db5bcf73b	fix `go-staticcheck` warning	2023-12-10 11:44:27 -05:00
Jeffrey Morgan	fa2f095bd9	fix model name returned by `/api/generate` being different than the model name provided	2023-12-10 11:42:15 -05:00
Jeffrey Morgan	045b855db9	fix error on accumulating final chat response	2023-12-10 11:24:39 -05:00
Jeffrey Morgan	32064a0646	fix empty response when receiving runner error	2023-12-10 10:53:38 -05:00
Jeffrey Morgan	9e1406e4ed	Don't expose model information in `/api/generate`	2023-12-09 02:05:43 -08:00
Bruce MacDonald	7e9405fd07	fix: encode full previous prompt in context (#1424)	2023-12-08 16:53:51 -05:00
Michael Yang	c3ff36088b	Merge pull request #774 from jmorganca/mxyng/server-version add version api and show server version in cli	2023-12-06 13:22:55 -08:00
Michael Yang	5d75505ebd	return model configuration in generate	2023-12-05 14:39:02 -08:00
Michael Yang	b9495ea162	load projectors	2023-12-05 14:36:12 -08:00
Michael Yang	d3479c07a1	Merge pull request #1250 from jmorganca/mxyng/create-layer refactor layer creation	2023-12-05 14:32:52 -08:00
Bruce MacDonald	195e3d9dbd	chat api endpoint (#1392)	2023-12-05 14:57:33 -05:00
Michael Yang	1ebdbd9694	server: add version handler	2023-12-05 09:36:01 -08:00
Jeffrey Morgan	00d06619a1	Revert "chat api (#991)" while context variable is fixed This reverts commit `7a0899d62d`.	2023-12-04 21:16:27 -08:00
Michael Yang	a3737cbd33	use NewLayer for CreateBlobHandler	2023-12-04 16:59:23 -08:00
Bruce MacDonald	7a0899d62d	chat api (#991) - update chat docs - add messages chat endpoint - remove deprecated context and template generate parameters from docs - context and template are still supported for the time being and will continue to work as expected - add partial response to chat history	2023-12-04 18:01:06 -05:00
Bruce MacDonald	96122b7271	validate model tags on copy (#1323)	2023-11-29 15:54:29 -05:00
Timothy Jaeryang Baek	c2e3b89176	fix: disable ':' in tag names (#1280) Co-authored-by: rootedbox	2023-11-29 13:33:45 -05:00
Bruce MacDonald	37d95157df	fix relative path on create (#1222)	2023-11-21 15:43:17 -05:00
Bruce MacDonald	43a726149d	fix potentially inaccurate error message	2023-11-18 21:25:07 -05:00
Jeffrey Morgan	bab9494176	add `-` separator to temp file created on `ollama create`	2023-11-18 09:39:52 -05:00
Michael Yang	c6e6c8ee7e	fix cross device rename	2023-11-17 15:22:17 -08:00
Michael Yang	54f92f01cb	update docs	2023-11-15 15:28:15 -08:00
Michael Yang	bc22d5a38b	no blob response	2023-11-15 15:16:23 -08:00
Michael Yang	1901044b07	use checksum reference	2023-11-15 15:16:23 -08:00
Michael Yang	1552cee59f	client create modelfile	2023-11-15 15:16:23 -08:00
Michael Yang	3ca56b5ada	add create modelfile field	2023-11-15 15:16:23 -08:00
Michael Yang	b0d14ed51c	refactor create model	2023-11-15 15:16:23 -08:00
Jeffrey Morgan	5cba29b9d6	JSON mode: add `"format"` as an api parameter (#1051) * add `"format": "json"` as an API parameter --------- Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2023-11-09 16:44:02 -08:00
Bruce MacDonald	ec2a31e9b3	support raw generation requests (#952) - add the optional `raw` generate request parameter to bypass prompt formatting and response context - add raw request to docs	2023-11-08 14:05:02 -08:00
Noah Gitsham	8ae8c9fa8c	Remove duplicate "install" in GPU support warning (#984)	2023-11-03 00:45:14 -07:00
Noah Gitsham	f39daff461	Add missing "be" to GPU support warning message (#983)	2023-11-02 18:37:12 -07:00
Michael Yang	2c6189f4fe	Merge pull request #750 from jmorganca/mxyng/concurrent-uploads concurrent uploads	2023-11-01 15:00:01 -07:00
Bruce MacDonald	f9a4281124	clean up: remove server functions from client (#937)	2023-10-30 11:10:18 -04:00
Michael Yang	4e09aab8b9	concurrent uploads	2023-10-27 17:07:33 -07:00
Michael Yang	386169205c	update runtime options (#864)	2023-10-20 21:17:14 -04:00
Jeffrey Morgan	7ed5a39bc7	simpler check for model loading compatibility errors	2023-10-19 14:50:49 -04:00
Michael Yang	e1c5be24e7	check json eof	2023-10-19 09:21:51 -07:00
Michael Yang	2ad8a074ac	generate: set created_at move the empty response so it's more visible	2023-10-19 09:21:51 -07:00
Michael Yang	7e547c6833	s/message/error/	2023-10-19 09:21:04 -07:00
Michael Yang	689842b9ff	request: bad request when model missing fields	2023-10-19 09:21:04 -07:00
Michael Yang	a19d47642e	models: rm workDir from CreateModel unused after removing EMBED	2023-10-19 09:21:04 -07:00
Bruce MacDonald	fe6f3b48f7	do not reload the running llm when runtime params change (#840) - only reload the running llm if the model has changed, or the options for loading the running model have changed - rename loaded llm to runner to differentiate from loaded model image - remove logic which keeps the first system prompt in the generation context	2023-10-19 10:39:58 -04:00
Yiorgis Gozadinos	8c6c2cbc8c	When the .ollama folder is broken or there are no models return an empty list on /api/tags	2023-10-18 08:23:20 +02:00
Michael Yang	1af493c5a0	server: print version on start	2023-10-16 09:59:14 -07:00
Bruce MacDonald	a0c3e989de	deprecate modelfile embed command (#759)	2023-10-16 11:07:37 -04:00
Bruce MacDonald	7804b8fab9	validate api options fields from map (#711)	2023-10-12 11:18:11 -04:00
Bruce MacDonald	274d5a5fdf	optional parameter to not stream response (#639) * update streaming request accept header * add optional stream param to request bodies	2023-10-11 12:54:27 -04:00
Bruce MacDonald	af4cf55884	not found error before pulling model (#718)	2023-10-06 16:06:20 -04:00
Jay Nakrani	1d0ebe67e8	Document response stream chunk delimiter. (#632)	2023-09-29 21:45:52 -07:00
Bruce MacDonald	a1b2d95f96	remove unused push/pull params (#650)	2023-09-29 17:27:19 -04:00
Michael Yang	8608eb4760	prune empty directories	2023-09-27 10:58:09 -07:00
Jeffrey Morgan	9b12a511ca	check other request fields before load short circuit in `/api/generate`	2023-09-22 23:50:55 -04:00
Bruce MacDonald	5d71bda478	close llm on interrupt (#577)	2023-09-22 19:41:52 +01:00
Michael Yang	82f5b66c01	register HEAD /api/tags	2023-09-21 16:38:03 -07:00
Michael Yang	c986694367	fix HEAD / request HEAD request should respond like their GET counterparts except without a response body.	2023-09-21 16:35:58 -07:00
Bruce MacDonald	4cba75efc5	remove tmp directories created by previous servers (#559) * remove tmp directories created by previous servers * clean up on server stop * Update routes.go * Update server/routes.go Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * create top-level temp ollama dir * check file exists before creating --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> Co-authored-by: Michael Yang <mxyng@pm.me>	2023-09-21 20:38:49 +01:00
Michael Yang	1fabba474b	refactor default allow origins this should be less error prone	2023-09-21 09:42:25 -07:00
Bruce MacDonald	1255bc9b45	only package 11.8 runner	2023-09-20 20:00:41 +01:00
Patrick Devine	80dd44e80a	Cmd changes (#541)	2023-09-18 12:26:56 -07:00
Bruce MacDonald	f221637053	first pass at linux gpu support (#454) * linux gpu support * handle multiple gpus * add cuda docker image (#488) --------- Co-authored-by: Michael Yang <mxyng@pm.me>	2023-09-12 11:04:35 -04:00
Patrick Devine	e7e91cd71c	add autoprune to remove unused layers (#491)	2023-09-11 11:46:35 -07:00
Patrick Devine	790d24eb7b	add show command (#474)	2023-09-06 11:04:17 -07:00
Michael Yang	681f3c4c42	fix num_keep	2023-09-03 17:47:49 -04:00
Michael Yang	eeb40a672c	fix list models for windows	2023-08-31 09:47:10 -04:00
Michael Yang	0f541a0367	s/ListResponseModel/ModelResponse/	2023-08-31 09:47:10 -04:00
Bruce MacDonald	42998d797d	subprocess llama.cpp server (#401) * remove c code * pack llama.cpp * use request context for llama_cpp * let llama_cpp decide the number of threads to use * stop llama runner when app stops * remove sample count and duration metrics * use go generate to get libraries * tmp dir for running llm	2023-08-30 16:35:03 -04:00
Patrick Devine	8bbff2df98	add model IDs (#439)	2023-08-28 20:50:24 -07:00
Michael Yang	95187d7e1e	build release mode	2023-08-22 09:52:43 -07:00
Jeffrey Morgan	a9f6c56652	fix `FROM` instruction erroring when referring to a file	2023-08-22 09:39:42 -07:00
Ryan Baker	0a892419ad	Strip protocol from model path (#377)	2023-08-21 21:56:56 -07:00
Bruce MacDonald	326de48930	use loaded llm for embeddings	2023-08-15 10:50:54 -03:00
Patrick Devine	d9cf18e28d	add maximum retries when pushing (#334)	2023-08-11 15:41:55 -07:00
Michael Yang	6517bcc53c	Merge pull request #290 from jmorganca/add-adapter-layers implement loading ggml lora adapters through the modelfile	2023-08-10 17:23:01 -07:00
Michael Yang	6a6828bddf	Merge pull request #167 from jmorganca/decode-ggml partial decode ggml bin for more info	2023-08-10 17:22:40 -07:00
Jeffrey Morgan	040a5b9750	clean up cli flags	2023-08-10 09:27:03 -07:00
Michael Yang	6de5d032e1	implement loading ggml lora adapters through the modelfile	2023-08-10 09:23:39 -07:00
Michael Yang	fccf8d179f	partial decode ggml bin for more info	2023-08-10 09:23:10 -07:00
Bruce MacDonald	4b3507f036	embeddings endpoint Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>	2023-08-10 11:45:57 -04:00
Bruce MacDonald	868e3b31c7	allow for concurrent pulls of the same files	2023-08-09 11:31:54 -04:00
Bruce MacDonald	09d8bf6730	fix build errors	2023-08-09 10:45:57 -04:00
Bruce MacDonald	7a5f3616fd	embed text document in modelfile	2023-08-09 10:26:19 -04:00
Jeffrey Morgan	cff002b824	use content type `application/x-ndjson` for streaming responses	2023-08-08 21:38:10 -07:00
Jeffrey Morgan	a027a7dd65	add `0.0.0.0` as an allowed origin by default Fixes #282	2023-08-08 13:39:50 -07:00
Bruce MacDonald	21ddcaa1f1	pr comments - default to embeddings enabled - move embedding logic for loaded model to request - allow embedding full directory - close llm on reload	2023-08-08 13:49:37 -04:00
Michael Yang	f2074ed4c0	Merge pull request #306 from jmorganca/default-keep-system automatically set num_keep if num_keep < 0	2023-08-08 09:25:34 -07:00
Bruce MacDonald	a6f6d18f83	embed text document in modelfile	2023-08-08 11:27:17 -04:00
Michael Yang	4dc5b117dd	automatically set num_keep if num_keep < 0 num_keep defines how many tokens to keep in the context when truncating inputs. if left to its default value of -1, the server will calculate num_keep to be the left of the system instructions	2023-08-07 16:19:12 -07:00
cmiller01	fb593b7bfc	pass flags to `serve` to allow setting allowed-origins + host and port * resolves: https://github.com/jmorganca/ollama/issues/300 and https://github.com/jmorganca/ollama/issues/282 * example usage: ``` ollama serve --port 9999 --allowed-origins "http://foo.example.com,http://192.0.0.1" ```	2023-08-07 03:34:37 +00:00
Jeffrey Morgan	e3fb1fd3f1	server: compare options correctly	2023-08-03 15:55:40 -04:00
Bruce MacDonald	8b1e791820	allow specifying zero values in modelfile	2023-08-02 17:07:53 -04:00
Jeffrey Morgan	03cff3a225	server: reset digest at end of generate	2023-08-02 16:15:44 -04:00
Bruce MacDonald	8f8b6288ac	check server is running before running command	2023-08-02 10:51:23 -04:00
Bruce MacDonald	765994362c	use head to check heartbeat	2023-08-01 14:50:38 -04:00
Bruce MacDonald	1c5a8770ee	read runner parameter options from map - read runner options from map to see what was specified explicitly and overwrite zero values	2023-08-01 13:38:19 -04:00
Bruce MacDonald	daa0d1de7a	allow specifying zero values in modelfile	2023-08-01 13:37:50 -04:00
Jeffrey Morgan	528bafa585	cache loaded model	2023-08-01 11:24:18 -04:00
Bruce MacDonald	671eec6da9	log prediction failures	2023-07-31 16:46:37 -04:00
Michael Yang	f62a882760	add session expiration	2023-07-27 09:31:44 -07:00
Michael Yang	32aec66e6a	add load duration	2023-07-27 09:31:44 -07:00
Michael Yang	35af37a2cb	session id	2023-07-27 09:31:44 -07:00
Bruce MacDonald	4c1caa3733	download models when creating from modelfile	2023-07-25 14:25:13 -04:00
Patrick Devine	4cb42ca55e	add copy command (#191)	2023-07-24 11:27:28 -04:00
Michael Yang	8609db77ea	use gin-contrib/cors middleware	2023-07-22 09:39:08 -07:00
Patrick Devine	6d6b0d3321	change error handler behavior and fix error when a model isn't found (#173)	2023-07-21 23:02:12 -07:00
Patrick Devine	9f6e97865c	allow pushing/pulling to insecure registries (#157)	2023-07-21 15:42:19 -07:00
Bruce MacDonald	7ba1308595	Merge pull request #147 from jmorganca/brucemacd/cli-err-display Improve CLI error display	2023-07-21 16:10:19 +02:00
Patrick Devine	e7a393de54	add rm command for models (#151)	2023-07-20 16:09:23 -07:00

296 commits