baalajimaestro/llama.cpp

Author	SHA1	Message	Date
Lucas Doyle	a5aa6c1478	llama_cpp server: add missing top_k param to CreateChatCompletionRequest `llama.create_chat_completion` definitely has a `top_k` argument, but its missing from `CreateChatCompletionRequest`. decision: add it	2023-05-01 15:38:19 -07:00
Lucas Doyle	1e42913599	llama_cpp server: move logprobs to supported I think this is actually supported (its in the arguments of `LLama.__call__`, which is how the completion is invoked). decision: mark as supported	2023-05-01 15:38:19 -07:00
Lucas Doyle	b47b9549d5	llama_cpp server: delete some ignored / unused parameters `n`, `presence_penalty`, `frequency_penalty`, `best_of`, `logit_bias`, `user`: not supported, excluded from the calls into llama. decision: delete it	2023-05-01 15:38:19 -07:00
Lucas Doyle	e40fcb0575	llama_cpp server: mark model as required `model` is ignored, but currently marked "optional"... on the one hand could mark "required" to make it explicit in case the server supports multiple llama's at the same time, but also could delete it since its ignored. decision: mark it required for the sake of openai api compatibility. I think out of all parameters, `model` is probably the most important one for people to keep using even if its ignored for now.	2023-05-01 15:38:19 -07:00
Andrei Betlen	9ff9cdd7fc	Fix import error	2023-05-01 15:11:15 -04:00
Lucas Doyle	efe8e6f879	llama_cpp server: slight refactor to init_llama function Define an init_llama function that starts llama with supplied settings instead of just doing it in the global context of app.py This allows the test to be less brittle by not needing to mess with os.environ, then importing the app	2023-04-29 11:42:23 -07:00
Lucas Doyle	6d8db9d017	tests: simple test for server module	2023-04-29 11:42:20 -07:00
Lucas Doyle	468377b0e2	llama_cpp server: app is now importable, still runnable as a module	2023-04-29 11:41:25 -07:00

1 2 3

108 commits