Andrei Betlen
|
48c3b77e6f
|
Offload KQV by default
|
2024-01-18 11:08:57 -05:00 |
|
Kyle Mistele
|
9c36688b33
|
fix(cli): allow passing n_ctx=0 to OpenAI API server args to use model n_ctx_train field per #1015 (#1093)
|
2024-01-16 18:54:06 -05:00 |
|
Andrei Betlen
|
84615adbc6
|
Add split_mode option. Closes #1085
|
2024-01-15 12:49:20 -05:00 |
|
Phil H
|
76aafa6149
|
Implement GGUF metadata KV overrides (#1011)
* Implement GGUF metadata overrides
* whitespace fix
* Fix kv overrides.
* Fix pointer and pickle
* Match llama.cpp kv_overrides cli argument
---------
Co-authored-by: Andrei <abetlen@gmail.com>
|
2024-01-15 12:29:29 -05:00 |
|
Andrei Betlen
|
522aecb868
|
docs: add server config docs
|
2023-12-22 14:37:24 -05:00 |
|
Dave
|
12b7f2f4e9
|
[Feat] Multi model support (#931)
* Update Llama class to handle chat_format & caching
* Add settings.py
* Add util.py & update __main__.py
* multimodel
* update settings.py
* cleanup
* delete util.py
* Fix /v1/models endpoint
* MultiLlama now iterable, app check-alive on "/"
* instant model init if file is given
* backward compatibility
* revert model param mandatory
* fix error
* handle individual model config json
* refactor
* revert chathandler/clip_model changes
* handle chat_handler in MultiLlama()
* split settings into server/llama
* reduce global vars
* Update LlamaProxy to handle config files
* Add free method to LlamaProxy
* update arg parsers & install server alias
* refactor cache settings
* change server executable name
* better var name
* whitespace
* Revert "whitespace"
This reverts commit bc5cf51c64a95bfc9926e1bc58166059711a1cd8.
* remove exe_name
* Fix merge bugs
* Fix type annotations
* Fix type annotations
* Fix uvicorn app factory
* Fix settings
* Refactor server
* Remove formatting fix
* Format
* Use default model if not found in model settings
* Fix
* Cleanup
* Fix
* Fix
* Remove unused CommandLineSettings
* Cleanup
* Support default name for copilot-codex models
---------
Co-authored-by: Andrei Betlen <abetlen@gmail.com>
|
2023-12-22 05:51:25 -05:00 |
|