baalajimaestro/llama.cpp

Fork 0

Commit graph

Author	SHA1	Message	Date
Phil H	76aafa6149	Implement GGUF metadata KV overrides (#1011 ) * Implement GGUF metadata overrides * whitespace fix * Fix kv overrides. * Fix pointer and pickle * Match llama.cpp kv_overrides cli argument --------- Co-authored-by: Andrei <abetlen@gmail.com>	2024-01-15 12:29:29 -05:00
Andrei Betlen	522aecb868	docs: add server config docs	2023-12-22 14:37:24 -05:00
Dave	12b7f2f4e9	[Feat] Multi model support (#931 ) * Update Llama class to handle chat_format & caching * Add settings.py * Add util.py & update __main__.py * multimodel * update settings.py * cleanup * delete util.py * Fix /v1/models endpoint * MultiLlama now iterable, app check-alive on "/" * instant model init if file is given * backward compability * revert model param mandatory * fix error * handle individual model config json * refactor * revert chathandler/clip_model changes * handle chat_handler in MulitLlama() * split settings into server/llama * reduce global vars * Update LlamaProxy to handle config files * Add free method to LlamaProxy * update arg parsers & install server alias * refactor cache settings * change server executable name * better var name * whitespace * Revert "whitespace" This reverts commit bc5cf51c64a95bfc9926e1bc58166059711a1cd8. * remove exe_name * Fix merge bugs * Fix type annotations * Fix type annotations * Fix uvicorn app factory * Fix settings * Refactor server * Remove formatting fix * Format * Use default model if not found in model settings * Fix * Cleanup * Fix * Fix * Remove unnused CommandLineSettings * Cleanup * Support default name for copilot-codex models --------- Co-authored-by: Andrei Betlen <abetlen@gmail.com>	2023-12-22 05:51:25 -05:00

Author

SHA1

Message

Date

Phil H

76aafa6149

Implement GGUF metadata KV overrides (#1011 )

* Implement GGUF metadata overrides

* whitespace fix

* Fix kv overrides.

* Fix pointer and pickle

* Match llama.cpp kv_overrides cli argument

---------

Co-authored-by: Andrei <abetlen@gmail.com>

2024-01-15 12:29:29 -05:00

Andrei Betlen

522aecb868

docs: add server config docs

2023-12-22 14:37:24 -05:00

Dave

12b7f2f4e9

[Feat] Multi model support (#931 )

* Update Llama class to handle chat_format & caching

* Add settings.py

* Add util.py & update __main__.py

* multimodel

* update settings.py

* cleanup

* delete util.py

* Fix /v1/models endpoint

* MultiLlama now iterable, app check-alive on "/"

* instant model init if file is given

* backward compability

* revert model param mandatory

* fix error

* handle individual model config json

* refactor

* revert chathandler/clip_model changes

* handle chat_handler in MulitLlama()

* split settings into server/llama

* reduce global vars

* Update LlamaProxy to handle config files

* Add free method to LlamaProxy

* update arg parsers & install server alias

* refactor cache settings

* change server executable name

* better var name

* whitespace

* Revert "whitespace"

This reverts commit bc5cf51c64a95bfc9926e1bc58166059711a1cd8.

* remove exe_name

* Fix merge bugs

* Fix type annotations

* Fix type annotations

* Fix uvicorn app factory

* Fix settings

* Refactor server

* Remove formatting fix

* Format

* Use default model if not found in model settings

* Fix

* Cleanup

* Fix

* Fix

* Remove unnused CommandLineSettings

* Cleanup

* Support default name for copilot-codex models

---------

Co-authored-by: Andrei Betlen <abetlen@gmail.com>

2023-12-22 05:51:25 -05:00

3 commits