Commit 4b01a873ef:

* Support defaulting to infinity or -1 for chat completions
* Check if completion_tokens is None in the error handler.
* fix: max_tokens in create completion should match the OpenAI spec
* Fix __call__

Co-authored-by: Andrei Betlen <abetlen@gmail.com>
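The first two commit bullets concern how `max_tokens` is interpreted: the OpenAI spec lets a client omit the limit (or pass a sentinel such as -1) to mean "generate until EOS or the context window is exhausted". A minimal sketch of that normalization, with a hypothetical helper name and parameters that are not taken from the actual llama-cpp-python source:

```python
def normalize_max_tokens(max_tokens, context_remaining):
    """Hypothetical sketch: map the OpenAI-style max_tokens value to a
    concrete generation limit.

    - None or a negative value (e.g. -1) means "no explicit limit",
      so we fall back to whatever the context window still allows.
    - Otherwise the requested limit is capped at the remaining context.
    """
    if max_tokens is None or max_tokens < 0:
        return context_remaining
    return min(max_tokens, context_remaining)


# Example usage with an assumed 512 tokens of remaining context:
print(normalize_max_tokens(-1, 512))    # treated as unlimited -> 512
print(normalize_max_tokens(None, 512))  # omitted -> 512
print(normalize_max_tokens(100, 512))   # explicit limit kept -> 100
```

The error-handler bullet is the complementary case: code that reports token usage must tolerate `completion_tokens` being `None` before doing arithmetic on it.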
Directory contents:

* server/
* __init__.py
* _utils.py
* llama.py
* llama_chat_format.py
* llama_cpp.py
* llama_grammar.py
* llama_types.py
* llava_cpp.py
* py.typed