Michael Yang
743e957d88
use filepath for os compat
2023-07-14 17:27:14 -07:00
Michael Yang
5ade3db040
fix race
...
block on write, which only returns once the channel is closed. this is
contrary to the previous arrangement, where the handler could return
before the stream had finished writing. that could lead to the client
receiving unexpected responses (since the request had already been
handled) or, in the worst case, a nil-pointer dereference as the stream
tries to flush a nil writer
2023-07-14 15:10:46 -07:00
Michael Yang
1775647f76
continue conversation
...
feed responses back into the llm
2023-07-13 17:13:00 -07:00
Michael Yang
05e08d2310
return more info in generate response
2023-07-13 09:37:32 -07:00
Michael Yang
31590284a7
fix route
2023-07-12 19:21:49 -07:00
Michael Yang
2666d3c206
fix pull race
2023-07-12 19:07:23 -07:00
Michael Yang
0944b01e7d
pull fixes
2023-07-12 09:55:07 -07:00
Michael Yang
a806b03f62
no errgroup
2023-07-11 14:58:10 -07:00
Michael Yang
e243329e2e
check api status
2023-07-11 13:42:05 -07:00
Michael Yang
2a66a1164a
common stream producer
2023-07-11 13:42:05 -07:00
Michael Yang
fd4792ec56
call llama.cpp directly from go
2023-07-11 11:59:18 -07:00
Jeffrey Morgan
a3ec1ec2a0
consistent error handling for pull and generate
2023-07-10 21:34:15 -07:00
Michael Yang
edba935d67
return error in generate response
2023-07-10 13:30:10 -07:00
Bruce MacDonald
f5e2e150b8
allow overriding default generate options
2023-07-10 20:58:02 +02:00
Jeffrey Morgan
74e92d1258
add basic / route for server
2023-07-07 23:46:15 -04:00
Bruce MacDonald
f533f85d44
pr feedback
...
- move error check to api client pull
- simplify error check in generate
- return nil on any pull error
2023-07-07 17:12:02 -04:00
Bruce MacDonald
61dd87bd90
if directory cannot be resolved, do not fail
2023-07-07 15:27:43 -04:00
Patrick Devine
3f1b7177f2
pass model and predict options
2023-07-07 09:34:05 -07:00
Michael Yang
b0618a466e
generate progress
2023-07-06 17:07:40 -07:00
Michael Yang
15c114decb
fix prompt templates
2023-07-06 17:03:18 -07:00
Michael Yang
0637632258
simple pull response
2023-07-06 16:34:44 -04:00
Michael Yang
dd960d1d5e
update generate response
2023-07-06 16:34:44 -04:00
Michael Yang
9b8a456c7d
embed templates
2023-07-06 16:34:44 -04:00
Bruce MacDonald
7cf5905063
display pull progress
2023-07-06 16:34:44 -04:00
Michael Yang
580fe8951c
free llama model
2023-07-06 16:34:44 -04:00
Michael Yang
68e6b4550c
use prompt templates
2023-07-06 16:34:44 -04:00
Bruce MacDonald
a6494f8211
pull models
2023-07-06 16:34:44 -04:00
Michael Yang
1b7183c5a1
enable metal gpu acceleration
...
ggml-metal.metal must be in the same directory as the ollama binary
otherwise llama.cpp will not be able to find it and load it.
1. go generate llama/llama_metal.go
2. go build .
3. ./ollama serve
2023-07-06 16:34:44 -04:00
Jeffrey Morgan
0998d4f0a4
remove debug print statements
2023-07-06 16:34:44 -04:00
Bruce MacDonald
8ea5e5e147
separate routes
2023-07-06 16:34:44 -04:00
Jeffrey Morgan
fd962a36e5
client updates
2023-07-06 16:34:44 -04:00
Jeffrey Morgan
9164981d72
move prompt templates out of python bindings
2023-07-06 16:34:44 -04:00
Jeffrey Morgan
6093a88c1a
add llama.cpp go bindings
2023-07-06 16:34:44 -04:00
Jeffrey Morgan
76cb60d496
wip go engine
...
Co-authored-by: Patrick Devine <pdevine@sonic.net>
2023-07-06 16:34:44 -04:00