Bruce MacDonald
4b3507f036
embeddings endpoint
...
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-08-10 11:45:57 -04:00
Bruce MacDonald
7a5f3616fd
embed text document in modelfile
2023-08-09 10:26:19 -04:00
Bruce MacDonald
21ddcaa1f1
pr comments
...
- default to embeddings enabled
- move embedding logic for loaded model to request
- allow embedding full directory
- close llm on reload
2023-08-08 13:49:37 -04:00
Michael Yang
f2074ed4c0
Merge pull request #306 from jmorganca/default-keep-system
...
automatically set num_keep if num_keep < 0
2023-08-08 09:25:34 -07:00
Jeffrey Morgan
8713ac23a8
allow overriding template
and system
in /api/generate
...
Fixes #297
Fixes #296
2023-08-08 00:55:34 -04:00
Michael Yang
4dc5b117dd
automatically set num_keep if num_keep < 0
...
num_keep defines how many tokens to keep in the context when truncating
inputs. if left to its default value of -1, the server will calculate
num_keep to be the left of the system instructions
2023-08-07 16:19:12 -07:00
Michael Yang
b9f4d67554
configurable rope frequency parameters
2023-08-03 22:11:58 -07:00
Bruce MacDonald
8b1e791820
allow specifying zero values in modelfile
2023-08-02 17:07:53 -04:00
Bruce MacDonald
8f8b6288ac
check server is running before running command
2023-08-02 10:51:23 -04:00
Bruce MacDonald
765994362c
use head to check heartbeat
2023-08-01 14:50:38 -04:00
Bruce MacDonald
1c5a8770ee
read runner parameter options from map
...
- read runner options from map to see what was specified explicitly and overwrite zero values
2023-08-01 13:38:19 -04:00
Jeffrey Morgan
528bafa585
cache loaded model
2023-08-01 11:24:18 -04:00
Bruce MacDonald
e72fe7945f
check server is running before running command
2023-07-31 16:25:57 -04:00
Bruce MacDonald
184ad8f057
allow specifying stop conditions in modelfile
2023-07-28 11:02:04 -04:00
Jeffrey Morgan
822a0e36eb
lower batch size to 512
2023-07-28 10:56:21 -04:00
Michael Yang
fadf75f99d
add stop conditions
2023-07-27 17:00:47 -07:00
Michael Yang
ad3a7d0e2c
add NumGQA
2023-07-27 14:05:11 -07:00
Jeffrey Morgan
688661ab9b
increase default batch size to 1024
2023-07-27 16:51:01 -04:00
Michael Yang
cca61181cb
sample metrics
2023-07-27 09:31:44 -07:00
Michael Yang
c490416189
lock on llm.lock(); decrease batch size
2023-07-27 09:31:44 -07:00
Michael Yang
f62a882760
add session expiration
2023-07-27 09:31:44 -07:00
Michael Yang
3003fc03fc
update predict code
2023-07-27 09:31:44 -07:00
Michael Yang
32aec66e6a
add load duration
2023-07-27 09:31:44 -07:00
Michael Yang
35af37a2cb
session id
2023-07-27 09:31:44 -07:00
Bruce MacDonald
4c1caa3733
download models when creating from modelfile
2023-07-25 14:25:13 -04:00
Bruce MacDonald
536028c35a
better error message when model not found on pull
2023-07-24 17:48:17 -04:00
Patrick Devine
4cb42ca55e
add copy command ( #191 )
2023-07-24 11:27:28 -04:00
Patrick Devine
6d6b0d3321
change error handler behavior and fix error when a model isn't found ( #173 )
2023-07-21 23:02:12 -07:00
Patrick Devine
9f6e97865c
allow pushing/pulling to insecure registries ( #157 )
2023-07-21 15:42:19 -07:00
Bruce MacDonald
7ba1308595
Merge pull request #147 from jmorganca/brucemacd/cli-err-display
...
Improve CLI error display
2023-07-21 16:10:19 +02:00
Patrick Devine
e7a393de54
add rm command for models ( #151 )
2023-07-20 16:09:23 -07:00
Michael Yang
1f27d7f1b8
fix stream errors
2023-07-20 12:12:08 -07:00
Bruce MacDonald
ebaa33ac28
display gin api errors in cli
2023-07-20 20:45:12 +02:00
Michael Yang
68df36ae50
fix pull 0 bytes on completed layer
2023-07-18 19:38:11 -07:00
Patrick Devine
5bea29f610
add new list command ( #97 )
2023-07-18 09:09:45 -07:00
Patrick Devine
2fb52261ad
basic distribution w/ push/pull ( #78 )
...
* basic distribution w/ push/pull
* add the parser
* add create, pull, and push
* changes to the parser, FROM line, and fix commands
* mkdirp new manifest directories
* make `blobs` directory if it does not exist
* fix go warnings
* add progressbar for model pulls
* move model struct
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2023-07-16 17:02:22 -07:00
Michael Yang
965f9ad033
Merge pull request #77 from jmorganca/mem
...
continue conversation
2023-07-14 14:57:42 -07:00
Michael Yang
5fefaa5d4d
fix typo
2023-07-14 10:47:18 -07:00
Michael Yang
1775647f76
continue conversation
...
feed responses back into the llm
2023-07-13 17:13:00 -07:00
Michael Yang
05e08d2310
return more info in generate response
2023-07-13 09:37:32 -07:00
Michael Yang
e243329e2e
check api status
2023-07-11 13:42:05 -07:00
Michael Yang
fd4792ec56
call llama.cpp directly from go
2023-07-11 11:59:18 -07:00
Jeffrey Morgan
a3ec1ec2a0
consistent error handling for pull and generate
2023-07-10 21:34:15 -07:00
Michael Yang
edba935d67
return error in generate response
2023-07-10 13:30:10 -07:00
Bruce MacDonald
2d49197b3b
increase default model size to 512
2023-07-10 21:24:41 +02:00
Bruce MacDonald
f5e2e150b8
allow overriding default generate options
2023-07-10 20:58:02 +02:00
Bruce MacDonald
f533f85d44
pr feedback
...
- move error check to api client pull
- simplify error check in generate
- return nil on any pull error
2023-07-07 17:12:02 -04:00
Bruce MacDonald
61dd87bd90
if directory cannot be resolved, do not fail
2023-07-07 15:27:43 -04:00
Michael Yang
303982b56e
fix run generate
2023-07-07 11:36:29 -07:00
Patrick Devine
3f1b7177f2
pass model and predict options
2023-07-07 09:34:05 -07:00
Michael Yang
291bb97e3d
client request options
2023-07-06 17:08:28 -07:00
Michael Yang
b0e63bfb4c
simplify api client
2023-07-06 17:07:40 -07:00
Michael Yang
c4b9e84945
progress
2023-07-06 17:07:40 -07:00
Michael Yang
3d6009aae3
run prompts
2023-07-06 17:07:40 -07:00
Michael Yang
0637632258
simple pull response
2023-07-06 16:34:44 -04:00
Michael Yang
dd960d1d5e
update generate response
2023-07-06 16:34:44 -04:00
Bruce MacDonald
c9f45abef3
resumable downloads
2023-07-06 16:34:44 -04:00
Bruce MacDonald
7cf5905063
display pull progress
2023-07-06 16:34:44 -04:00
Michael Yang
5079282120
tcp socket
2023-07-06 16:34:44 -04:00
Michael Yang
68e6b4550c
use prompt templates
2023-07-06 16:34:44 -04:00
Bruce MacDonald
a6494f8211
pull models
2023-07-06 16:34:44 -04:00
Jeffrey Morgan
fd962a36e5
client updates
2023-07-06 16:34:44 -04:00
Jeffrey Morgan
6093a88c1a
add llama.cpp go bindings
2023-07-06 16:34:44 -04:00
Jeffrey Morgan
76cb60d496
wip go engine
...
Co-authored-by: Patrick Devine <pdevine@sonic.net>
2023-07-06 16:34:44 -04:00