llama.cpp/llama_cpp/server
Latest commit 5212fb08ae by twaka, 2024-05-14 09:50:53 -04:00
feat: add MinTokensLogitProcessor and min_tokens argument to server (#1333)

* implement min_tokens
* set default to 0
* pass min_tokens
* fix
* remove copy
* implement MinTokensLogitsProcessor
* format
* fix condition
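
For context on what this commit adds: in llama-cpp-python, a logits processor is a callable that receives the token ids generated so far plus the raw logits for the next token and returns adjusted logits. A minimum-length constraint follows naturally from that hook: suppress the end-of-sequence token until the completion has reached min_tokens. Below is a minimal sketch of such a processor; the signature follows the project's numpy-array convention, but the class internals here are illustrative rather than a copy of the merged code.

```python
from typing import Optional

import numpy as np
import numpy.typing as npt


class MinTokensLogitsProcessor:
    """Illustrative sketch: force at least `min_tokens` completion tokens
    by making the EOS token unsamplable until that count is reached."""

    def __init__(self, min_tokens: int, token_eos: int):
        self.min_tokens = min_tokens
        self.token_eos = token_eos
        self.prompt_tokens: Optional[int] = None  # captured on first call

    def __call__(
        self, input_ids: npt.NDArray[np.intc], scores: npt.NDArray[np.single]
    ) -> npt.NDArray[np.single]:
        # On the first call, input_ids contains only the prompt; remember
        # its length so completion tokens can be counted from that point on.
        if self.prompt_tokens is None:
            self.prompt_tokens = len(input_ids)
        # While the completion is shorter than min_tokens, drive the EOS
        # logit to -inf so the sampler can never pick it.
        if len(input_ids) - self.prompt_tokens < self.min_tokens:
            scores[self.token_eos] = -float("inf")
        return scores
```

With the default of 0 noted in the squashed commits, the processor leaves the logits untouched, so existing clients see no behavior change.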
File        | Last commit                                                                  | Date
__init__.py | llama_cpp server: app is now importable, still runnable as a module         | 2023-04-29 11:41:25 -07:00
__main__.py | feat: Add support for yaml based configs                                     | 2024-04-10 02:47:01 -04:00
app.py      | feat: add MinTokensLogitProcessor and min_tokens argument to server (#1333) | 2024-05-14 09:50:53 -04:00
cli.py      | Fix python3.8 support                                                        | 2024-01-19 08:17:49 -05:00
errors.py   | misc: Format                                                                 | 2024-02-28 14:27:40 -05:00
model.py    | fix(server): Propagate flash_attn to model load. (#1424)                     | 2024-05-03 12:17:07 -04:00
settings.py | feat(server): Add support for setting root_path. Closes #1420                | 2024-05-05 12:49:31 -04:00
types.py    | feat: add MinTokensLogitProcessor and min_tokens argument to server (#1333)  | 2024-05-14 09:50:53 -04:00
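
Since the same #1333 commit touches both types.py (the request schema) and app.py (the request handling), the new parameter should be reachable from an ordinary completion request to the server. A hypothetical request sketch, assuming a server started with `python -m llama_cpp.server` listening on the default local port:

```python
import requests

# min_tokens is the field added in #1333; the other fields are standard
# completion parameters. Endpoint and port assume a default local server.
resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "prompt": "Q: Name the planets in the solar system. A:",
        "max_tokens": 64,
        "min_tokens": 16,  # suppress EOS until 16 completion tokens
    },
)
print(resp.json()["choices"][0]["text"])
```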