ollama

Author	SHA1	Message	Date
Jeffrey Morgan	9b12a511ca	check other request fields before load short circuit in `/api/generate`	2023-09-22 23:50:55 -04:00
Bruce MacDonald	5d71bda478	close llm on interrupt (#577 )	2023-09-22 19:41:52 +01:00
Michael Yang	82f5b66c01	register HEAD /api/tags	2023-09-21 16:38:03 -07:00
Michael Yang	c986694367	fix HEAD / request HEAD request should respond like their GET counterparts except without a response body.	2023-09-21 16:35:58 -07:00
Bruce MacDonald	4cba75efc5	remove tmp directories created by previous servers (#559 ) * remove tmp directories created by previous servers * clean up on server stop * Update routes.go * Update server/routes.go Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * create top-level temp ollama dir * check file exists before creating --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> Co-authored-by: Michael Yang <mxyng@pm.me>	2023-09-21 20:38:49 +01:00
Michael Yang	1fabba474b	refactor default allow origins this should be less error prone	2023-09-21 09:42:25 -07:00
Michael Yang	ee4fd16f2c	Merge pull request #556 from jmorganca/pack-cuda pack in cuda libs	2023-09-20 15:02:36 -07:00
Bruce MacDonald	1255bc9b45	only package 11.8 runner	2023-09-20 20:00:41 +01:00
Michael Yang	499e9007a5	pick chunksize based on location	2023-09-20 11:10:24 -07:00
Michael Yang	aa45d7c1df	draft: explicitly follow upload redirects	2023-09-19 13:36:58 -07:00
Michael Yang	a5520bfb42	fix build	2023-09-19 10:42:24 -07:00
Michael Yang	b58d5d16b0	fix mkdir on windows	2023-09-19 09:41:13 -07:00
Patrick Devine	24580df958	only add a layer if there is actual data (#535 )	2023-09-18 13:47:45 -07:00
Patrick Devine	80dd44e80a	Cmd changes (#541 )	2023-09-18 12:26:56 -07:00
Michael Yang	08d7c2a944	fix error on upload chunk	2023-09-15 15:59:30 -07:00
Michael Yang	e53bc57d4d	split uploadBlobChunked	2023-09-14 17:22:05 -07:00
Michael Yang	f0b398d17f	implement ProgressWriter	2023-09-14 17:22:04 -07:00
Michael Yang	daa4f096f9	set request.ContentLength This informs the HTTP client the content length is known and disables chunked Transfer-Encoding	2023-09-14 13:32:44 -07:00
Michael Yang	e6881cabd0	remove unused	2023-09-13 14:48:33 -07:00
Michael Yang	0c5a454361	fix model type for 70b	2023-09-12 15:12:59 -07:00
Michael Yang	7dee25a07f	fix falcon decode get model and file type from bin file	2023-09-12 12:34:53 -07:00
Bruce MacDonald	f221637053	first pass at linux gpu support (#454 ) * linux gpu support * handle multiple gpus * add cuda docker image (#488) --------- Co-authored-by: Michael Yang <mxyng@pm.me>	2023-09-12 11:04:35 -04:00
Patrick Devine	45ac07cd02	create the blobs directory correctly (#508 )	2023-09-11 14:54:52 -07:00
Patrick Devine	e7e91cd71c	add autoprune to remove unused layers (#491 )	2023-09-11 11:46:35 -07:00
Jeffrey Morgan	3920e15386	add model format to config layer (#497 )	2023-09-09 17:53:44 -04:00
Michael Yang	de227b620f	fix nil pointer dereference	2023-09-07 17:24:31 -07:00
Michael Yang	738fe9c4aa	Merge pull request #486 from jmorganca/mxyng/fix-push fix: retry push on expired token	2023-09-07 13:58:34 -07:00
Michael Yang	bf146fb072	fix retry on unauthorized chunk	2023-09-07 12:02:04 -07:00
Michael Yang	f0f4943577	fix get auth token	2023-09-07 12:01:56 -07:00
Bruce MacDonald	09dd2aeff9	GGUF support (#441 )	2023-09-07 13:55:37 -04:00
Michael Yang	83c6be1666	fix model manifests (#477 )	2023-09-06 17:30:08 -04:00
Patrick Devine	790d24eb7b	add show command (#474 )	2023-09-06 11:04:17 -07:00
Michael Yang	a1ecdd36d5	create manifests directory	2023-09-05 17:10:40 -07:00
Michael Yang	d1c2558f7e	Merge pull request #461 from jmorganca/mxyng/fix-inherit-params fix inherit params	2023-09-05 12:30:23 -07:00
Michael Yang	06ef90c051	fix parameter inheritence parameters are not inherited because they are processed differently from other layer. fix this by explicitly merging the inherited params into the new params. parameter values defined in the new modelfile will override those defined in the inherited modelfile. array lists are replaced instead of appended	2023-09-05 11:40:20 -07:00
Michael Yang	e9f6df7dca	use slices.DeleteFunc	2023-09-05 09:56:59 -07:00
Michael Yang	681f3c4c42	fix num_keep	2023-09-03 17:47:49 -04:00
Quinn Slack	62d29b2157	do not HTML-escape prompt The `html/template` package automatically HTML-escapes interpolated strings in templates. This behavior is undesirable because it causes prompts like `<h1>hello` to be escaped to `<h1>hello` before being passed to the LLM. The included test case passes, but before the code change, it failed: ``` --- FAIL: TestModelPrompt images_test.go:21: got "a<h1>b", want "a<h1>b" ```	2023-09-01 17:16:38 -05:00
Michael Yang	1c8fd627ad	windows: fix create modelfile	2023-08-31 09:47:10 -04:00
Michael Yang	ae950b00f1	windows: fix delete	2023-08-31 09:47:10 -04:00
Michael Yang	eeb40a672c	fix list models for windows	2023-08-31 09:47:10 -04:00
Michael Yang	0f541a0367	s/ListResponseModel/ModelResponse/	2023-08-31 09:47:10 -04:00
Bruce MacDonald	42998d797d	subprocess llama.cpp server (#401 ) * remove c code * pack llama.cpp * use request context for llama_cpp * let llama_cpp decide the number of threads to use * stop llama runner when app stops * remove sample count and duration metrics * use go generate to get libraries * tmp dir for running llm	2023-08-30 16:35:03 -04:00
Quinn Slack	f4432e1dba	treat stop as stop sequences, not exact tokens (#442 ) The `stop` option to the generate API is a list of sequences that should cause generation to stop. Although these are commonly called "stop tokens", they do not necessarily correspond to LLM tokens (per the LLM's tokenizer). For example, if the caller sends a generate request with `"stop":["\n"]`, then generation should stop on any token containing `\n` (and trim `\n` from the output), not just if the token exactly matches `\n`. If `stop` were interpreted strictly as LLM tokens, then it would require callers of the generate API to know the LLM's tokenizer and enumerate many tokens in the `stop` list. Fixes https://github.com/jmorganca/ollama/issues/295.	2023-08-30 11:53:42 -04:00
Michael Yang	982c535428	Merge pull request #428 from jmorganca/mxyng/upload-chunks update upload chunks	2023-08-30 07:47:17 -07:00
Patrick Devine	8bbff2df98	add model IDs (#439 )	2023-08-28 20:50:24 -07:00
Michael Yang	16b06699fd	remove unused parameter	2023-08-28 18:35:18 -04:00
Michael Yang	246dc65417	loosen http status code checks	2023-08-28 18:34:53 -04:00
Michael Yang	865fceb73c	chunked pipe	2023-08-28 18:34:53 -04:00
Michael Yang	72266c7684	bump chunk size to 95MB	2023-08-28 18:34:53 -04:00

1 2 3 4 5

234 commits