llama.cpp/llama_cpp
Dave 12b7f2f4e9
[Feat] Multi model support (#931)
* Update Llama class to handle chat_format & caching

* Add settings.py

* Add util.py & update __main__.py

* multimodel

* update settings.py

* cleanup

* delete util.py

* Fix /v1/models endpoint

* MultiLlama now iterable, app check-alive on "/"

* instant model init if file is given

* backward compatibility

* revert model param mandatory

* fix error

* handle individual model config json

* refactor

* revert chathandler/clip_model changes

* handle chat_handler in MultiLlama()

* split settings into server/llama

* reduce global vars

* Update LlamaProxy to handle config files

* Add free method to LlamaProxy

* update arg parsers & install server alias

* refactor cache settings

* change server executable name

* better var name

* whitespace

* Revert "whitespace"

This reverts commit bc5cf51c64a95bfc9926e1bc58166059711a1cd8.

* remove exe_name

* Fix merge bugs

* Fix type annotations

* Fix type annotations

* Fix uvicorn app factory

* Fix settings

* Refactor server

* Remove formatting fix

* Format

* Use default model if not found in model settings

* Fix

* Cleanup

* Fix

* Fix

* Remove unused CommandLineSettings

* Cleanup

* Support default name for copilot-codex models

---------

Co-authored-by: Andrei Betlen <abetlen@gmail.com>
2023-12-22 05:51:25 -05:00
server [Feat] Multi model support (#931) 2023-12-22 05:51:25 -05:00
__init__.py Bump version 2023-12-18 16:09:18 -05:00
_utils.py Fix UnsupportedOperation: fileno in suppress_stdout_stderr (#961) 2023-12-11 20:44:51 -05:00
llama.py fix text_offset of multi-token characters (#1037) 2023-12-22 00:03:29 -05:00
llama_chat_format.py Add qwen chat format (#1005) 2023-12-13 21:43:43 -05:00
llama_cpp.py Update llama.cpp 2023-12-22 00:12:37 -05:00
llama_grammar.py Add from_json_schema to LlamaGrammar 2023-11-23 00:27:00 -05:00
llama_types.py Add missing tool_calls finish_reason 2023-11-10 02:51:06 -05:00
llava_cpp.py Make building llava optional 2023-11-28 04:55:21 -05:00
py.typed Add py.typed 2023-08-11 09:58:48 +02:00