Jeffrey Morgan
9b12a511ca
check other request fields before load short circuit in /api/generate
2023-09-22 23:50:55 -04:00
Bruce MacDonald
5d71bda478
close llm on interrupt (#577)
2023-09-22 19:41:52 +01:00
Michael Yang
82f5b66c01
register HEAD /api/tags
2023-09-21 16:38:03 -07:00
Michael Yang
c986694367
fix HEAD / request
...
HEAD requests should respond like their GET counterparts, except without a response body.
2023-09-21 16:35:58 -07:00
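For context, a minimal sketch of how a HEAD route can reuse its GET handler, assuming a Gin router like the one the server uses (the handler body below is illustrative, not the real /api/tags handler):
```
package main

import (
	"net/http"

	"github.com/gin-gonic/gin"
)

func main() {
	r := gin.Default()

	// Illustrative handler; the real /api/tags handler lists local models.
	listHandler := func(c *gin.Context) {
		c.JSON(http.StatusOK, gin.H{"models": []any{}})
	}

	// Registering the same handler for GET and HEAD keeps the two responses
	// consistent; net/http discards the body it writes for HEAD requests.
	r.GET("/api/tags", listHandler)
	r.HEAD("/api/tags", listHandler)

	r.Run("127.0.0.1:11434")
}
```
A probe such as `curl -I http://127.0.0.1:11434/api/tags` then returns the same headers and status as the GET route, with no body.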
Bruce MacDonald
4cba75efc5
remove tmp directories created by previous servers (#559)
...
* remove tmp directories created by previous servers
* clean up on server stop
* Update routes.go
* Update server/routes.go
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* create top-level temp ollama dir
* check file exists before creating
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-21 20:38:49 +01:00
Michael Yang
1fabba474b
refactor default allow origins
...
this should be less error prone
2023-09-21 09:42:25 -07:00
Bruce MacDonald
1255bc9b45
only package 11.8 runner
2023-09-20 20:00:41 +01:00
Patrick Devine
80dd44e80a
Cmd changes (#541)
2023-09-18 12:26:56 -07:00
Bruce MacDonald
f221637053
first pass at linux gpu support (#454)
...
* linux gpu support
* handle multiple gpus
* add cuda docker image (#488)
---------
Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-12 11:04:35 -04:00
Patrick Devine
e7e91cd71c
add autoprune to remove unused layers (#491)
2023-09-11 11:46:35 -07:00
Patrick Devine
790d24eb7b
add show command (#474)
2023-09-06 11:04:17 -07:00
Michael Yang
681f3c4c42
fix num_keep
2023-09-03 17:47:49 -04:00
Michael Yang
eeb40a672c
fix list models for windows
2023-08-31 09:47:10 -04:00
Michael Yang
0f541a0367
s/ListResponseModel/ModelResponse/
2023-08-31 09:47:10 -04:00
Bruce MacDonald
42998d797d
subprocess llama.cpp server (#401)
...
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
2023-08-30 16:35:03 -04:00
Patrick Devine
8bbff2df98
add model IDs (#439)
2023-08-28 20:50:24 -07:00
Michael Yang
95187d7e1e
build release mode
2023-08-22 09:52:43 -07:00
Jeffrey Morgan
a9f6c56652
fix FROM instruction erroring when referring to a file
2023-08-22 09:39:42 -07:00
Ryan Baker
0a892419ad
Strip protocol from model path (#377)
2023-08-21 21:56:56 -07:00
Bruce MacDonald
326de48930
use loaded llm for embeddings
2023-08-15 10:50:54 -03:00
Patrick Devine
d9cf18e28d
add maximum retries when pushing (#334)
2023-08-11 15:41:55 -07:00
Michael Yang
6517bcc53c
Merge pull request #290 from jmorganca/add-adapter-layers
...
implement loading ggml lora adapters through the modelfile
2023-08-10 17:23:01 -07:00
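A hedged sketch of what this looks like in a Modelfile: the ADAPTER instruction points at a ggml LoRA adapter applied on top of the base model (both names below are placeholders):
```
FROM llama2
ADAPTER ./ggml-lora-adapter.bin
```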
Michael Yang
6a6828bddf
Merge pull request #167 from jmorganca/decode-ggml
...
partial decode ggml bin for more info
2023-08-10 17:22:40 -07:00
Jeffrey Morgan
040a5b9750
clean up cli flags
2023-08-10 09:27:03 -07:00
Michael Yang
6de5d032e1
implement loading ggml lora adapters through the modelfile
2023-08-10 09:23:39 -07:00
Michael Yang
fccf8d179f
partial decode ggml bin for more info
2023-08-10 09:23:10 -07:00
Bruce MacDonald
4b3507f036
embeddings endpoint
...
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-08-10 11:45:57 -04:00
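A minimal Go sketch of calling the embeddings endpoint this adds, assuming the server is on the default port and a model named "llama2" is available locally (both are assumptions for illustration):
```
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Request body for POST /api/embeddings; the model name is a placeholder.
	body, _ := json.Marshal(map[string]string{
		"model":  "llama2",
		"prompt": "The sky is blue because of Rayleigh scattering.",
	})

	resp, err := http.Post("http://127.0.0.1:11434/api/embeddings",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The response carries a single embedding vector for the prompt.
	var out struct {
		Embedding []float64 `json:"embedding"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println("embedding length:", len(out.Embedding))
}
```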
Bruce MacDonald
868e3b31c7
allow for concurrent pulls of the same files
2023-08-09 11:31:54 -04:00
Bruce MacDonald
09d8bf6730
fix build errors
2023-08-09 10:45:57 -04:00
Bruce MacDonald
7a5f3616fd
embed text document in modelfile
2023-08-09 10:26:19 -04:00
Jeffrey Morgan
cff002b824
use content type application/x-ndjson for streaming responses
2023-08-08 21:38:10 -07:00
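With application/x-ndjson, each chunk of a streamed /api/generate response is one JSON object per line, so a client can consume it with a line scanner. A minimal sketch (the model name is a placeholder):
```
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	body, _ := json.Marshal(map[string]any{
		"model":  "llama2", // placeholder model name
		"prompt": "Why is the sky blue?",
	})

	resp, err := http.Post("http://127.0.0.1:11434/api/generate",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Each line is a standalone JSON object; "done" marks the final one.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		var chunk struct {
			Response string `json:"response"`
			Done     bool   `json:"done"`
		}
		if err := json.Unmarshal(scanner.Bytes(), &chunk); err != nil {
			panic(err)
		}
		fmt.Print(chunk.Response)
		if chunk.Done {
			break
		}
	}
}
```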
Jeffrey Morgan
a027a7dd65
add 0.0.0.0 as an allowed origin by default
...
Fixes #282
2023-08-08 13:39:50 -07:00
Bruce MacDonald
21ddcaa1f1
pr comments
...
- default to embeddings enabled
- move embedding logic for loaded model to request
- allow embedding full directory
- close llm on reload
2023-08-08 13:49:37 -04:00
Michael Yang
f2074ed4c0
Merge pull request #306 from jmorganca/default-keep-system
...
automatically set num_keep if num_keep < 0
2023-08-08 09:25:34 -07:00
Bruce MacDonald
a6f6d18f83
embed text document in modelfile
2023-08-08 11:27:17 -04:00
Michael Yang
4dc5b117dd
automatically set num_keep if num_keep < 0
...
num_keep defines how many tokens to keep in the context when truncating
inputs. If left at its default value of -1, the server calculates num_keep
so that the system instructions are kept.
2023-08-07 16:19:12 -07:00
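For illustration, the same knob can also be set explicitly per request through the options object of a generate call; the model, prompt, and value 24 below are placeholders, and -1 keeps the automatic behavior described above:
```
{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "options": {
    "num_keep": 24
  }
}
```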
cmiller01
fb593b7bfc
pass flags to serve to allow setting allowed-origins + host and port
...
* resolves: https://github.com/jmorganca/ollama/issues/300 and https://github.com/jmorganca/ollama/issues/282
* example usage:
```
ollama serve --port 9999 --allowed-origins "http://foo.example.com,http://192.0.0.1"
```
2023-08-07 03:34:37 +00:00
Jeffrey Morgan
e3fb1fd3f1
server: compare options correctly
2023-08-03 15:55:40 -04:00
Bruce MacDonald
8b1e791820
allow specifying zero values in modelfile
2023-08-02 17:07:53 -04:00
Jeffrey Morgan
03cff3a225
server: reset digest at end of generate
2023-08-02 16:15:44 -04:00
Bruce MacDonald
8f8b6288ac
check server is running before running command
2023-08-02 10:51:23 -04:00
Bruce MacDonald
765994362c
use head to check heartbeat
2023-08-01 14:50:38 -04:00
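A minimal sketch of the heartbeat idea: issue a HEAD request against the server root and treat a successful status as "running" (the URL and timeout are illustrative):
```
package main

import (
	"fmt"
	"net/http"
	"time"
)

// serverRunning reports whether a server answers a HEAD request at the
// given base URL. A HEAD probe is cheap: no response body is transferred.
func serverRunning(baseURL string) bool {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Head(baseURL)
	if err != nil {
		return false
	}
	resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

func main() {
	fmt.Println(serverRunning("http://127.0.0.1:11434/"))
}
```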
Bruce MacDonald
1c5a8770ee
read runner parameter options from map
...
- read runner options from map to see what was specified explicitly and overwrite zero values
2023-08-01 13:38:19 -04:00
Bruce MacDonald
daa0d1de7a
allow specifying zero values in modelfile
2023-08-01 13:37:50 -04:00
Jeffrey Morgan
528bafa585
cache loaded model
2023-08-01 11:24:18 -04:00
Bruce MacDonald
671eec6da9
log prediction failures
2023-07-31 16:46:37 -04:00
Michael Yang
f62a882760
add session expiration
2023-07-27 09:31:44 -07:00
Michael Yang
32aec66e6a
add load duration
2023-07-27 09:31:44 -07:00
Michael Yang
35af37a2cb
session id
2023-07-27 09:31:44 -07:00
Bruce MacDonald
4c1caa3733
download models when creating from modelfile
2023-07-25 14:25:13 -04:00