| Author | Commit | Message | Date |
|---|---|---|---|
| Michael Yang | c5bcf32823 | update llama.cpp | 2023-08-03 11:50:24 -07:00 |
| Michael Yang | 74a5f7e698 | no gpu for 70B model | 2023-08-01 17:12:50 -07:00 |
| Michael Yang | 319f078dd9 | remove -Werror: there are compile warnings on Linux which -Werror elevates to errors, preventing compile | 2023-07-31 21:45:56 -07:00 |
| Jeffrey Morgan | 7da249fcc1 | only build metal for darwin,arm target | 2023-07-31 21:35:23 -04:00 |
| Bruce MacDonald | 184ad8f057 | allow specifying stop conditions in modelfile | 2023-07-28 11:02:04 -04:00 |
| Michael Yang | 3549676678 | embed ggml-metal.metal | 2023-07-27 17:23:29 -07:00 |
| Michael Yang | fadf75f99d | add stop conditions | 2023-07-27 17:00:47 -07:00 |
| Michael Yang | ad3a7d0e2c | add NumGQA | 2023-07-27 14:05:11 -07:00 |
| Michael Yang | cca61181cb | sample metrics | 2023-07-27 09:31:44 -07:00 |
| Michael Yang | c490416189 | lock on llm.lock(); decrease batch size | 2023-07-27 09:31:44 -07:00 |
| Michael Yang | f62a882760 | add session expiration | 2023-07-27 09:31:44 -07:00 |
| Michael Yang | 3003fc03fc | update predict code | 2023-07-27 09:31:44 -07:00 |
| Michael Yang | 35af37a2cb | session id | 2023-07-27 09:31:44 -07:00 |
| Michael Yang | 726bc647b2 | enable k quants | 2023-07-25 08:39:58 -07:00 |
| Michael Yang | cb55fa9270 | enable accelerate | 2023-07-24 17:14:45 -07:00 |
| Michael Yang | b71c67b6ba | allocate a large enough tokens slice | 2023-07-21 23:05:15 -07:00 |
| Michael Yang | 40c9dc0a31 | fix multibyte responses | 2023-07-14 20:11:44 -07:00 |
| Michael Yang | 0142660bd4 | size_t | 2023-07-14 17:29:16 -07:00 |
| Michael Yang | 1775647f76 | continue conversation: feed responses back into the llm | 2023-07-13 17:13:00 -07:00 |
| Michael Yang | 05e08d2310 | return more info in generate response | 2023-07-13 09:37:32 -07:00 |
| Michael Yang | e1f0a0dc74 | fix eof error in generate | 2023-07-12 09:36:16 -07:00 |
| Jeffrey Morgan | c63f811909 | return error if model fails to load | 2023-07-11 20:32:26 -07:00 |
| Michael Yang | 442dec1c6f | vendor llama.cpp | 2023-07-11 11:59:18 -07:00 |
| Michael Yang | fd4792ec56 | call llama.cpp directly from go | 2023-07-11 11:59:18 -07:00 |
| Jeffrey Morgan | 5fb96255dc | llama: remove unused helper functions | 2023-07-09 10:25:07 -04:00 |
| Patrick Devine | 3f1b7177f2 | pass model and predict options | 2023-07-07 09:34:05 -07:00 |
| Michael Yang | 5dc9c8ff23 | more free | 2023-07-06 17:08:03 -07:00 |
| Bruce MacDonald | da74384a3e | remove prompt cache | 2023-07-06 17:49:05 -04:00 |
| Michael Yang | 2c80eddd71 | more free | 2023-07-06 16:34:44 -04:00 |
| Jeffrey Morgan | 9fe018675f | use Makefile for dependency building instead of go generate | 2023-07-06 16:34:44 -04:00 |
| Jeffrey Morgan | 0998d4f0a4 | remove debug print statements | 2023-07-06 16:34:44 -04:00 |
| Jeffrey Morgan | 79a999e95d | fix crash in bindings | 2023-07-06 16:34:44 -04:00 |
| Jeffrey Morgan | fd962a36e5 | client updates | 2023-07-06 16:34:44 -04:00 |
| Jeffrey Morgan | 0240165388 | fix llama.cpp build | 2023-07-06 16:34:44 -04:00 |
| Jeffrey Morgan | 6093a88c1a | add llama.cpp go bindings | 2023-07-06 16:34:44 -04:00 |