Commit graph

2039 commits

Author SHA1 Message Date
Bruce MacDonald
fc6ec356fc remove libcuda.so 2023-09-20 20:36:14 +01:00
Bruce MacDonald
1255bc9b45 only package 11.8 runner 2023-09-20 20:00:41 +01:00
Michael Yang
084e4c782a
Merge pull request #557 from jmorganca/mxyng/cleanup
fix impossible condition
2023-09-20 11:51:01 -07:00
Michael Yang
58ffa03d8b fix impossible condition 2023-09-20 11:27:44 -07:00
Michael Yang
637f8bc6a5
Merge pull request #536 from jmorganca/mxyng/redirect-uploads
explicitly follow upload redirects
2023-09-20 11:27:03 -07:00
Michael Yang
499e9007a5 pick chunksize based on location 2023-09-20 11:10:24 -07:00
Bruce MacDonald
b9bb5ca288 use cuda_version 2023-09-20 17:58:16 +01:00
Bruce MacDonald
4e8be787c7 pack in cuda libs 2023-09-20 17:40:42 +01:00
Michael Yang
aa45d7c1df draft: explicitly follow upload redirects 2023-09-19 13:36:58 -07:00
Michael Yang
e35565c567
Merge pull request #555 from jmorganca/mxyng/fix-windows-startup
fix build
2023-09-19 10:51:58 -07:00
Michael Yang
a5520bfb42 fix build 2023-09-19 10:42:24 -07:00
Michael Yang
2627c464ba
Merge pull request #554 from jmorganca/mxyng/fix-windows-startup
fix mkdir on windows
2023-09-19 09:42:12 -07:00
Michael Yang
b58d5d16b0 fix mkdir on windows 2023-09-19 09:41:13 -07:00
Patrick Devine
24580df958
only add a layer if there is actual data (#535) 2023-09-18 13:47:45 -07:00
Patrick Devine
80dd44e80a
Cmd changes (#541) 2023-09-18 12:26:56 -07:00
James Braza
94e1d96b29
Updated README section on community projects for table (#550) 2023-09-18 15:22:50 -04:00
Bruce MacDonald
66003e1d05
subprocess improvements (#524)
* subprocess improvements

- increase start-up timeout
- when runner fails to start fail rather than timing out
- try runners in order rather than choosing 1 runner
- embed metal runner in metal dir rather than gpu
- refactor logging and error messages

* Update llama.go

* Update llama.go

* simplify by using glob
2023-09-18 15:16:32 -04:00
Michael Yang
c345053a8b
Merge pull request #537 from jmorganca/mxyng/upload
fix error on upload chunk
2023-09-15 17:48:39 -07:00
Michael Yang
08d7c2a944 fix error on upload chunk 2023-09-15 15:59:30 -07:00
Michael Yang
bc9573dcb1
Merge pull request #530 from jmorganca/mxyng/progresswriter
implement ProgressWriter
2023-09-15 12:43:46 -07:00
Michael Yang
e53bc57d4d split uploadBlobChunked 2023-09-14 17:22:05 -07:00
Michael Yang
f0b398d17f implement ProgressWriter 2023-09-14 17:22:04 -07:00
Patrick Devine
8efbc5df55
DRAFT: add a simple python client to access ollama (#522) 2023-09-14 16:37:38 -07:00
Michael Yang
ccc3e9ac6d
Merge pull request #531 from jmorganca/mxyng/content-length
set request.ContentLength
2023-09-14 13:33:11 -07:00
Michael Yang
daa4f096f9 set request.ContentLength
This informs the HTTP client the content length is known and disables
chunked Transfer-Encoding
2023-09-14 13:32:44 -07:00
Michael Yang
3ee85f1c6c
Merge pull request #526 from jmorganca/mxyng/cleanup
remove unused
2023-09-14 13:10:59 -07:00
Bruce MacDonald
2540c9181c
support for packaging in multiple cuda runners (#509)
* enable packaging multiple cuda versions
* use nvcc cuda version if available

---------

Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-14 15:08:13 -04:00
Michael Yang
83ffb154bc
Merge pull request #507 from jmorganca/mxyng/build
update docker image
2023-09-14 11:25:59 -07:00
Michael Yang
9aa192c812 update cuda docker image 2023-09-14 11:25:20 -07:00
Matt Williams
fc8707686f
Update API docs (#527)
* Update API docs

Signed-off-by: Matt Williams <m@technovangelist.com>

* strange TOC was getting auto generated

Signed-off-by: Matt Williams <m@technovangelist.com>

* Update docs/api.md

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

* Update docs/api.md

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

* Update docs/api.md

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>

* Update api.md

---------

Signed-off-by: Matt Williams <m@technovangelist.com>
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
Co-authored-by: Michael Chiang <mchiang0610@users.noreply.github.com>
2023-09-14 08:51:26 -07:00
Michael Yang
f89c23764b
Merge pull request #525 from jmorganca/mxyng/falcon-decode
fix: add falcon.go
2023-09-13 15:08:47 -07:00
Michael Yang
e6881cabd0 remove unused 2023-09-13 14:48:33 -07:00
Michael Yang
d028853879 fix: add falcon.go 2023-09-13 14:47:37 -07:00
Michael Yang
949553db23
Merge pull request #519 from jmorganca/mxyng/decode
Mxyng/decode
2023-09-13 12:43:57 -07:00
Michael Yang
0c5a454361 fix model type for 70b 2023-09-12 15:12:59 -07:00
Bruce MacDonald
f59c4d03f7
fix ggml arm64 cuda build (#520) 2023-09-12 17:06:48 -04:00
Michael Yang
7dee25a07f fix falcon decode
get model and file type from bin file
2023-09-12 12:34:53 -07:00
Bruce MacDonald
f221637053
first pass at linux gpu support (#454)
* linux gpu support
* handle multiple gpus
* add cuda docker image (#488)
---------

Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-12 11:04:35 -04:00
Patrick Devine
45ac07cd02
create the blobs directory correctly (#508) 2023-09-11 14:54:52 -07:00
Jeffrey Morgan
7d749cc787 fix darwin build script 2023-09-11 16:31:46 -04:00
Patrick Devine
e7e91cd71c
add autoprune to remove unused layers (#491) 2023-09-11 11:46:35 -07:00
Jeffrey Morgan
3920e15386
add model format to config layer (#497) 2023-09-09 17:53:44 -04:00
Michael Yang
41e976edde
Merge pull request #492 from jmorganca/mxyng/nil-pointer
fix nil pointer dereference
2023-09-07 17:25:23 -07:00
Michael Yang
de227b620f fix nil pointer dereference 2023-09-07 17:24:31 -07:00
Michael Yang
63def6ca49
Merge pull request #487 from jmorganca/mxyng/dockerignore
update dockerignore
2023-09-07 14:16:17 -07:00
Michael Yang
738fe9c4aa
Merge pull request #486 from jmorganca/mxyng/fix-push
fix: retry push on expired token
2023-09-07 13:58:34 -07:00
Michael Yang
a8da0bacbe update dockerignore 2023-09-07 13:36:25 -07:00
Michael Yang
bf146fb072 fix retry on unauthorized chunk 2023-09-07 12:02:04 -07:00
Michael Yang
f0f4943577 fix get auth token 2023-09-07 12:01:56 -07:00
Bruce MacDonald
09dd2aeff9
GGUF support (#441) 2023-09-07 13:55:37 -04:00