Andrei Betlen
dc39cc0fa4
Use server sent events function for streaming completion
2023-05-19 02:04:30 -04:00
Andrei Betlen
a3352923c7
Add model_alias option to override model_path in completions. Closes #39
2023-05-16 17:22:00 -04:00
Andrei Betlen
cdf59768f5
Update llama.cpp
2023-05-14 00:04:22 -04:00
Andrei Betlen
8740ddc58e
Only support generating one prompt at a time.
2023-05-12 07:21:46 -04:00
Andrei Betlen
8895b9002a
Revert "llama_cpp server: prompt is a string". Closes #187
This reverts commit b9098b0ef7.
2023-05-12 07:16:57 -04:00
Lucas Doyle
02e8a018ae
llama_cpp server: document presence_penalty and frequency_penalty, mark as supported
2023-05-09 16:25:00 -07:00
Andrei Betlen
82d138fe54
Fix: default repeat_penalty
2023-05-08 18:49:11 -04:00
Andrei Betlen
0d751a69a7
Set repeat_penalty to 0 by default
2023-05-08 01:50:43 -04:00
Andrei Betlen
65d9cc050c
Add openai frequency and presence penalty parameters. Closes #169
2023-05-08 01:30:18 -04:00
Andrei Betlen
a0b61ea2a7
Bugfix for models endpoint
2023-05-07 20:17:52 -04:00
Andrei Betlen
14da46f16e
Added cache size to settings object.
2023-05-07 19:33:17 -04:00
Andrei Betlen
627811ea83
Add verbose flag to server
2023-05-07 05:09:10 -04:00
Andrei Betlen
3fbda71790
Fix mlock_supported and mmap_supported return type
2023-05-07 03:04:22 -04:00
Andrei Betlen
5a3413eee3
Update cpu_count
2023-05-07 03:03:57 -04:00
Andrei Betlen
1a00e452ea
Update settings fields and defaults
2023-05-07 02:52:20 -04:00
Andrei Betlen
86753976c4
Revert "llama_cpp server: delete some ignored / unused parameters"
This reverts commit b47b9549d5.
2023-05-07 02:02:34 -04:00
Andrei Betlen
c382d8f86a
Revert "llama_cpp server: mark model as required"
This reverts commit e40fcb0575.
2023-05-07 02:00:22 -04:00
Lucas Doyle
b9098b0ef7
llama_cpp server: prompt is a string
Not sure why this union type was here, but looking at llama.py, prompt is only ever processed as a string for completion.
This was breaking types when generating an OpenAPI client.
2023-05-02 14:47:07 -07:00
Andrei
7ab08b8d10
Merge branch 'main' into better-server-params-and-fields
2023-05-01 22:45:57 -04:00
Andrei Betlen
9eafc4c49a
Refactor server to use factory
2023-05-01 22:38:46 -04:00
Lucas Doyle
dbbfc4ba2f
llama_cpp server: fix to ChatCompletionRequestMessage
When I generate a client, it breaks because it fails to process the schema of ChatCompletionRequestMessage.
These changes fix that:
- I think `Union[Literal["user"], Literal["channel"], ...]` is the same as `Literal["user", "channel", ...]`
- It turns out the default value `Literal["user"]` isn't JSON serializable, so replace it with "user"
2023-05-01 15:38:19 -07:00
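The two points in commit dbbfc4ba2f can be checked with the standard library alone. This is a minimal sketch, not the server's code: the aliases `UnionOfLiterals` and `SingleLiteral` and the helpers `allowed_values` and `is_json_serializable` are hypothetical names introduced here, and only the two role values quoted in the commit message are used.

```python
import json
from typing import Literal, Union, get_args

# Hypothetical aliases mirroring the two spellings from the commit message.
UnionOfLiterals = Union[Literal["user"], Literal["channel"]]
SingleLiteral = Literal["user", "channel"]


def allowed_values(tp):
    """Collect the literal values accepted by a Literal or a Union of Literals."""
    values = []
    for arg in get_args(tp):
        # Args of a Union are Literal[...] types; args of a Literal are plain values.
        inner = get_args(arg)
        values.extend(inner if inner else [arg])
    return values


def is_json_serializable(value):
    """Return True if json.dumps accepts the value, False if it raises TypeError."""
    try:
        json.dumps(value)
    except TypeError:
        return False
    return True


# Both spellings accept the same set of values.
same_values = allowed_values(UnionOfLiterals) == allowed_values(SingleLiteral)

# A typing object used as a default cannot be serialized; the plain string can.
literal_default_ok = is_json_serializable(Literal["user"])
string_default_ok = is_json_serializable("user")
```

The two spellings are distinct objects at runtime, so the sketch compares the values they accept rather than the types themselves.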
Lucas Doyle
fa2a61e065
llama_cpp server: fields for the embedding endpoint
2023-05-01 15:38:19 -07:00
Lucas Doyle
8dcbf65a45
llama_cpp server: define fields for chat completions
Slight refactor for common fields shared between completion and chat completion
2023-05-01 15:38:19 -07:00
Lucas Doyle
978b6daf93
llama_cpp server: add some more information to fields for completions
2023-05-01 15:38:19 -07:00
Lucas Doyle
a5aa6c1478
llama_cpp server: add missing top_k param to CreateChatCompletionRequest
`llama.create_chat_completion` definitely has a `top_k` argument, but it's missing from `CreateChatCompletionRequest`. Decision: add it.
2023-05-01 15:38:19 -07:00
Lucas Doyle
1e42913599
llama_cpp server: move logprobs to supported
I think this is actually supported (it's in the arguments of `Llama.__call__`, which is how the completion is invoked). Decision: mark as supported.
2023-05-01 15:38:19 -07:00
Lucas Doyle
b47b9549d5
llama_cpp server: delete some ignored / unused parameters
`n`, `presence_penalty`, `frequency_penalty`, `best_of`, `logit_bias`, `user`: not supported, excluded from the calls into llama. Decision: delete them.
2023-05-01 15:38:19 -07:00
Lucas Doyle
e40fcb0575
llama_cpp server: mark model as required
`model` is ignored but currently marked "optional". On the one hand, it could be marked "required" to make it explicit in case the server supports multiple llamas at the same time; on the other, it could be deleted since it's ignored. Decision: mark it required for the sake of OpenAI API compatibility.
I think that, of all the parameters, `model` is probably the most important one for people to keep using, even if it's ignored for now.
2023-05-01 15:38:19 -07:00
Andrei Betlen
9ff9cdd7fc
Fix import error
2023-05-01 15:11:15 -04:00
Lucas Doyle
efe8e6f879
llama_cpp server: slight refactor to init_llama function
Define an init_llama function that starts llama with the supplied settings instead of doing it in the global context of app.py.
This makes the test less brittle: it no longer needs to modify os.environ before importing the app.
2023-04-29 11:42:23 -07:00
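The idea behind commit efe8e6f879 can be sketched roughly as follows. Everything here is an assumption made for illustration: `Settings`, `FakeLlama`, the `MODEL` environment variable, and the field names are hypothetical stand-ins, not the server's actual definitions. Only the name `init_llama` comes from the commit message.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class Settings:
    """Hypothetical settings object; the real server's fields may differ."""
    model: str
    n_ctx: int = 2048


class FakeLlama:
    """Hypothetical stand-in for the model wrapper, so the sketch runs without a model file."""

    def __init__(self, model_path: str, n_ctx: int) -> None:
        self.model_path = model_path
        self.n_ctx = n_ctx


# Module-level handle, populated by init_llama rather than at import time.
llama: Optional[FakeLlama] = None


def init_llama(settings: Optional[Settings] = None) -> FakeLlama:
    """Create the model from explicit settings, falling back to the environment."""
    global llama
    if settings is None:
        # Assumed fallback: read the model path from a MODEL environment variable.
        settings = Settings(model=os.environ["MODEL"])
    llama = FakeLlama(model_path=settings.model, n_ctx=settings.n_ctx)
    return llama


# A test can now pass settings directly instead of patching os.environ before import.
test_llama = init_llama(Settings(model="test-model.bin", n_ctx=16))
```

Because importing the module no longer starts the model, a test can import it freely and call `init_llama` with whatever settings it needs.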
Lucas Doyle
6d8db9d017
tests: simple test for server module
2023-04-29 11:42:20 -07:00
Lucas Doyle
468377b0e2
llama_cpp server: app is now importable, still runnable as a module
2023-04-29 11:41:25 -07:00