Commit graph

242 commits

Author SHA1 Message Date
Michael Yang
5b84404c64 handle concurrent requests for the same blobs 2023-10-06 12:56:43 -07:00
Michael Yang
8544edca21 parallel chunked downloads 2023-10-06 12:56:43 -07:00
Bruce MacDonald
2130c0708b
output type parsed from modelfile (#678) 2023-10-05 14:58:04 -04:00
Jay Nakrani
1d0ebe67e8
Document response stream chunk delimiter. (#632)
Document response stream chunk delimiter.
2023-09-29 21:45:52 -07:00
Bruce MacDonald
a1b2d95f96
remove unused push/pull params (#650) 2023-09-29 17:27:19 -04:00
Michael Yang
9333b0cc82
Merge pull request #612 from jmorganca/mxyng/prune-empty-directories
prune empty directories
2023-09-29 11:23:39 -07:00
Michael Yang
f40b3de758 use int64 consistently 2023-09-28 11:07:24 -07:00
Michael Yang
8608eb4760 prune empty directories 2023-09-27 10:58:09 -07:00
Jeffrey Morgan
9b12a511ca check other request fields before load short circuit in /api/generate 2023-09-22 23:50:55 -04:00
Bruce MacDonald
5d71bda478
close llm on interrupt (#577) 2023-09-22 19:41:52 +01:00
Michael Yang
82f5b66c01 register HEAD /api/tags 2023-09-21 16:38:03 -07:00
Michael Yang
c986694367 fix HEAD / request
HEAD request should respond like their GET counterparts except without a
response body.
2023-09-21 16:35:58 -07:00
Bruce MacDonald
4cba75efc5
remove tmp directories created by previous servers (#559)
* remove tmp directories created by previous servers

* clean up on server stop

* Update routes.go

* Update server/routes.go

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* create top-level temp ollama dir

* check file exists before creating

---------

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-21 20:38:49 +01:00
Michael Yang
1fabba474b refactor default allow origins
this should be less error prone
2023-09-21 09:42:25 -07:00
Michael Yang
ee4fd16f2c
Merge pull request #556 from jmorganca/pack-cuda
pack in cuda libs
2023-09-20 15:02:36 -07:00
Bruce MacDonald
1255bc9b45 only package 11.8 runner 2023-09-20 20:00:41 +01:00
Michael Yang
499e9007a5 pick chunksize based on location 2023-09-20 11:10:24 -07:00
Michael Yang
aa45d7c1df draft: explicitly follow upload redirects 2023-09-19 13:36:58 -07:00
Michael Yang
a5520bfb42 fix build 2023-09-19 10:42:24 -07:00
Michael Yang
b58d5d16b0 fix mkdir on windows 2023-09-19 09:41:13 -07:00
Patrick Devine
24580df958
only add a layer if there is actual data (#535) 2023-09-18 13:47:45 -07:00
Patrick Devine
80dd44e80a
Cmd changes (#541) 2023-09-18 12:26:56 -07:00
Michael Yang
08d7c2a944 fix error on upload chunk 2023-09-15 15:59:30 -07:00
Michael Yang
e53bc57d4d split uploadBlobChunked 2023-09-14 17:22:05 -07:00
Michael Yang
f0b398d17f implement ProgressWriter 2023-09-14 17:22:04 -07:00
Michael Yang
daa4f096f9 set request.ContentLength
This informs the HTTP client the content length is known and disables
chunked Transfer-Encoding
2023-09-14 13:32:44 -07:00
Michael Yang
e6881cabd0 remove unused 2023-09-13 14:48:33 -07:00
Michael Yang
0c5a454361 fix model type for 70b 2023-09-12 15:12:59 -07:00
Michael Yang
7dee25a07f fix falcon decode
get model and file type from bin file
2023-09-12 12:34:53 -07:00
Bruce MacDonald
f221637053
first pass at linux gpu support (#454)
* linux gpu support
* handle multiple gpus
* add cuda docker image (#488)
---------

Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-12 11:04:35 -04:00
Patrick Devine
45ac07cd02
create the blobs directory correctly (#508) 2023-09-11 14:54:52 -07:00
Patrick Devine
e7e91cd71c
add autoprune to remove unused layers (#491) 2023-09-11 11:46:35 -07:00
Jeffrey Morgan
3920e15386
add model format to config layer (#497) 2023-09-09 17:53:44 -04:00
Michael Yang
de227b620f fix nil pointer dereference 2023-09-07 17:24:31 -07:00
Michael Yang
738fe9c4aa
Merge pull request #486 from jmorganca/mxyng/fix-push
fix: retry push on expired token
2023-09-07 13:58:34 -07:00
Michael Yang
bf146fb072 fix retry on unauthorized chunk 2023-09-07 12:02:04 -07:00
Michael Yang
f0f4943577 fix get auth token 2023-09-07 12:01:56 -07:00
Bruce MacDonald
09dd2aeff9
GGUF support (#441) 2023-09-07 13:55:37 -04:00
Michael Yang
83c6be1666
fix model manifests (#477) 2023-09-06 17:30:08 -04:00
Patrick Devine
790d24eb7b
add show command (#474) 2023-09-06 11:04:17 -07:00
Michael Yang
a1ecdd36d5 create manifests directory 2023-09-05 17:10:40 -07:00
Michael Yang
d1c2558f7e
Merge pull request #461 from jmorganca/mxyng/fix-inherit-params
fix inherit params
2023-09-05 12:30:23 -07:00
Michael Yang
06ef90c051 fix parameter inheritence
parameters are not inherited because they are processed differently from
other layer. fix this by explicitly merging the inherited params into
the new params. parameter values defined in the new modelfile will
override those defined in the inherited modelfile. array lists are
replaced instead of appended
2023-09-05 11:40:20 -07:00
Michael Yang
e9f6df7dca use slices.DeleteFunc 2023-09-05 09:56:59 -07:00
Michael Yang
681f3c4c42 fix num_keep 2023-09-03 17:47:49 -04:00
Quinn Slack
62d29b2157 do not HTML-escape prompt
The `html/template` package automatically HTML-escapes interpolated strings in templates. This behavior is undesirable because it causes prompts like `<h1>hello` to be escaped to `&lt;h1&gt;hello` before being passed to the LLM.

The included test case passes, but before the code change, it failed:

```
--- FAIL: TestModelPrompt
    images_test.go:21: got "a&lt;h1&gt;b", want "a<h1>b"
```
2023-09-01 17:16:38 -05:00
Michael Yang
1c8fd627ad windows: fix create modelfile 2023-08-31 09:47:10 -04:00
Michael Yang
ae950b00f1 windows: fix delete 2023-08-31 09:47:10 -04:00
Michael Yang
eeb40a672c fix list models for windows 2023-08-31 09:47:10 -04:00
Michael Yang
0f541a0367 s/ListResponseModel/ModelResponse/ 2023-08-31 09:47:10 -04:00