Commit graph

97 commits

Author SHA1 Message Date
Michael Yang
32d1a00017 remove unused requestContextKey 2023-08-22 10:49:54 -07:00
Michael Yang
04e2128273 move upload funcs to upload.go 2023-08-22 10:49:53 -07:00
Michael Yang
2cc634689b use url.URL 2023-08-22 10:49:07 -07:00
Michael Yang
9ec7e37534
Merge pull request #392 from jmorganca/mxyng/version
add version
2023-08-22 09:50:25 -07:00
Michael Yang
2c7f956b38 add version 2023-08-22 09:40:58 -07:00
Jeffrey Morgan
a9f6c56652 fix FROM instruction erroring when referring to a file 2023-08-22 09:39:42 -07:00
Ryan Baker
0a892419ad
Strip protocol from model path (#377) 2023-08-21 21:56:56 -07:00
Michael Yang
3b49315f97 retry on unauthorized chunk push
The token printed for authorized requests has a lifetime of 1h. If an
upload exceeds 1h, a chunk push will fail since the token is created on
a "start upload" request.

This replaces the Pipe with SectionReader which is simpler and
implements Seek, a requirement for makeRequestWithRetry. This is
slightly worse than using a Pipe since the progress update is directly
tied to the chunk size instead of controlled separately.
2023-08-18 11:23:47 -07:00
Michael Yang
7eda70f23b copy metadata from source 2023-08-17 21:55:25 -07:00
Michael Yang
086449b6c7 fmt 2023-08-17 15:32:31 -07:00
Michael Yang
3cbc6a5c01 fix push manifest 2023-08-17 15:28:12 -07:00
Michael Yang
a894cc792d model and file type as strings 2023-08-17 12:08:04 -07:00
Michael Yang
b963a83559
Merge pull request #364 from jmorganca/chunked-uploads
reimplement chunked uploads
2023-08-17 09:58:51 -07:00
Michael Yang
bf6688abe6
Merge pull request #360 from jmorganca/fix-request-copies
Fix request copies
2023-08-17 09:58:42 -07:00
Bruce MacDonald
6005b157c2
retry download on network errors 2023-08-17 10:31:45 -04:00
Michael Yang
5dfe91be8b reimplement chunked uploads 2023-08-16 14:50:24 -07:00
Michael Yang
9f944c00f1 push: retry on unauthorized 2023-08-16 11:35:33 -07:00
Michael Yang
56e87cecb1 images: remove body copies 2023-08-16 10:30:41 -07:00
Michael Yang
5d9a4cd251
Merge pull request #348 from jmorganca/cross-repo-mount
cross repo blob mount
2023-08-16 09:20:36 -07:00
Bruce MacDonald
1deb35ca64
use loaded llm for generating model file embeddings 2023-08-15 16:12:02 -03:00
Bruce MacDonald
e2de886831
do not regenerate embeddings 2023-08-15 16:10:22 -03:00
Bruce MacDonald
f0d7c2f5ea retry download on network errors 2023-08-15 15:07:19 -03:00
Bruce MacDonald
326de48930 use loaded llm for embeddings 2023-08-15 10:50:54 -03:00
Bruce MacDonald
18f2cb0472 dont log fatal 2023-08-15 10:39:59 -03:00
Michael Yang
e26085b921 close open files 2023-08-14 16:08:06 -07:00
Michael Yang
f594c8eb91 cross repo mount 2023-08-14 15:07:35 -07:00
Bruce MacDonald
2c8b680b03 use file info for embeddings cache 2023-08-14 12:11:04 -03:00
Bruce MacDonald
99b6b60085 use model bin digest for embed digest 2023-08-14 11:57:12 -03:00
Bruce MacDonald
e9a9580bdd do not regenerate embeddings
- re-use previously evaluated embeddings when possible
- change embeddings digest identifier to be based on model name and embedded file path
2023-08-14 10:34:17 -03:00
Patrick Devine
d9cf18e28d
add maximum retries when pushing (#334) 2023-08-11 15:41:55 -07:00
Michael Yang
6517bcc53c
Merge pull request #290 from jmorganca/add-adapter-layers
implement loading ggml lora adapters through the modelfile
2023-08-10 17:23:01 -07:00
Michael Yang
6a6828bddf
Merge pull request #167 from jmorganca/decode-ggml
partial decode ggml bin for more info
2023-08-10 17:22:40 -07:00
Patrick Devine
be989d89d1
Token auth (#314) 2023-08-10 11:34:25 -07:00
Michael Yang
6de5d032e1 implement loading ggml lora adapters through the modelfile 2023-08-10 09:23:39 -07:00
Michael Yang
fccf8d179f partial decode ggml bin for more info 2023-08-10 09:23:10 -07:00
Bruce MacDonald
984c9c628c fix embeddings invalid values 2023-08-09 16:50:53 -04:00
Bruce MacDonald
ac971c56d1 Update images.go 2023-08-09 11:31:54 -04:00
Bruce MacDonald
868e3b31c7 allow for concurrent pulls of the same files 2023-08-09 11:31:54 -04:00
Bruce MacDonald
1bee2347be pr feedback
- defer closing llm on embedding
- do not override licenses
- remove debugging print line
- reformat model file docs
2023-08-08 17:01:37 -04:00
Bruce MacDonald
884d78ceb3 allow embedding from model binary 2023-08-08 14:38:57 -04:00
Bruce MacDonald
21ddcaa1f1 pr comments
- default to embeddings enabled
- move embedding logic for loaded model to request
- allow embedding full directory
- close llm on reload
2023-08-08 13:49:37 -04:00
Bruce MacDonald
a6f6d18f83 embed text document in modelfile 2023-08-08 11:27:17 -04:00
Jeffrey Morgan
8713ac23a8 allow overriding template and system in /api/generate
Fixes #297
Fixes #296
2023-08-08 00:55:34 -04:00
Michael Yang
a71ff3f6a2 use a pipe to push to registry with progress
switch to a monolithic upload instead of a chunk upload through a pipe
to report progress
2023-08-03 10:37:13 -07:00
Bruce MacDonald
1c5a8770ee read runner parameter options from map
- read runner options from map to see what was specified explicitly and overwrite zero values
2023-08-01 13:38:19 -04:00
Bruce MacDonald
daa0d1de7a allow specifying zero values in modelfile 2023-08-01 13:37:50 -04:00
Jeffrey Morgan
528bafa585 cache loaded model 2023-08-01 11:24:18 -04:00
Michael Yang
872011630a fix license 2023-07-31 21:46:48 -07:00
Michael Yang
203fdbc4b8 check err 2023-07-31 21:46:48 -07:00
Michael Yang
70e0ab6b3d remove unnecessary fmt.Sprintf 2023-07-31 21:46:47 -07:00