Commit graph

108 commits

Author SHA1 Message Date
Lucas Doyle
a5aa6c1478 llama_cpp server: add missing top_k param to CreateChatCompletionRequest
`llama.create_chat_completion` definitely has a `top_k` argument, but its missing from `CreateChatCompletionRequest`. decision: add it
2023-05-01 15:38:19 -07:00
Lucas Doyle
1e42913599 llama_cpp server: move logprobs to supported
I think this is actually supported (its in the arguments of `LLama.__call__`, which is how the completion is invoked). decision: mark as supported
2023-05-01 15:38:19 -07:00
Lucas Doyle
b47b9549d5 llama_cpp server: delete some ignored / unused parameters
`n`, `presence_penalty`, `frequency_penalty`, `best_of`, `logit_bias`, `user`: not supported, excluded from the calls into llama. decision: delete it
2023-05-01 15:38:19 -07:00
Lucas Doyle
e40fcb0575 llama_cpp server: mark model as required
`model` is ignored, but currently marked "optional"... on the one hand could mark "required" to make it explicit in case the server supports multiple llama's at the same time, but also could delete it since its ignored. decision: mark it required for the sake of openai api compatibility.

I think out of all parameters, `model` is probably the most important one for people to keep using even if its ignored for now.
2023-05-01 15:38:19 -07:00
Andrei Betlen
9ff9cdd7fc Fix import error 2023-05-01 15:11:15 -04:00
Lucas Doyle
efe8e6f879 llama_cpp server: slight refactor to init_llama function
Define an init_llama function that starts llama with supplied settings instead of just doing it in the global context of app.py

This allows the test to be less brittle by not needing to mess with os.environ, then importing the app
2023-04-29 11:42:23 -07:00
Lucas Doyle
6d8db9d017 tests: simple test for server module 2023-04-29 11:42:20 -07:00
Lucas Doyle
468377b0e2 llama_cpp server: app is now importable, still runnable as a module 2023-04-29 11:41:25 -07:00