llama.cpp/llama_cpp/server
Latest commit: 4b01a873ef by swg, 2023-12-22 14:05:13 -05:00
server: Support None defaulting to infinity for completions (#111)

* Support defaulting to infinity or -1 for chat completions
* Check if completion_tokens is None in the error handler
* fix: max_tokens in create completion should match the OpenAI spec
* Fix __call__

Co-authored-by: Andrei Betlen <abetlen@gmail.com>
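The subject line summarizes a request-validation fix: the OpenAI completions spec allows `max_tokens` to be omitted or null, and the server now treats that as an unbounded budget rather than an error. A minimal sketch of the idea, assuming a Pydantic request model and llama.cpp's convention that a negative token budget means "generate until EOS or the context fills"; the helper names here are illustrative, not the project's actual code:

```python
from typing import Optional

from pydantic import BaseModel, Field


class CreateCompletionRequest(BaseModel):
    prompt: str
    # Per the OpenAI spec, max_tokens may be omitted or null;
    # None means "no explicit cap" instead of a validation error.
    max_tokens: Optional[int] = Field(default=16, ge=0)


def resolve_max_tokens(req: CreateCompletionRequest) -> int:
    # llama.cpp treats a negative token budget as "generate until EOS
    # or the context window is exhausted", so None (and 0) map to -1.
    if not req.max_tokens:
        return -1
    return req.max_tokens


def context_overflow_message(
    prompt_tokens: int, completion_tokens: Optional[int], n_ctx: int
) -> str:
    # Per the second bullet above, the error handler must now
    # tolerate completion_tokens being None.
    requested = prompt_tokens + (completion_tokens or 0)
    return (
        f"Requested {requested} tokens ({prompt_tokens} prompt, "
        f"{completion_tokens or 0} completion), but n_ctx is {n_ctx}."
    )
```

With this shape, a request that omits max_tokens flows through to llama.cpp as an unlimited (-1) prediction budget instead of being rejected.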
File          Last commit                                                           Date
__init__.py   llama_cpp server: app is now importable, still runnable as a module   2023-04-29 11:41:25 -07:00
__main__.py   [Feat] Multi model support (#931)                                     2023-12-22 05:51:25 -05:00
app.py        [Feat] Multi model support (#931)                                     2023-12-22 05:51:25 -05:00
cli.py        [Feat] Multi model support (#931)                                     2023-12-22 05:51:25 -05:00
errors.py     server: Support None defaulting to infinity for completions (#111)    2023-12-22 14:05:13 -05:00
model.py      [Feat] Multi model support (#931)                                     2023-12-22 05:51:25 -05:00
settings.py   [Feat] Multi model support (#931)                                     2023-12-22 05:51:25 -05:00
types.py      server: Support None defaulting to infinity for completions (#111)    2023-12-22 14:05:13 -05:00
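The __init__.py entry records the two supported ways to start this server: import the FastAPI app from your own code, or run the package as a module. A minimal sketch of the importable path, assuming the create_app factory in app.py and the post-#931 ServerSettings/ModelSettings split (exact field names may differ between versions, and the model path is illustrative):

```python
import uvicorn

from llama_cpp.server.app import create_app
from llama_cpp.server.settings import ModelSettings, ServerSettings

# Multi model support (#931): model_settings takes a list, one entry per model.
app = create_app(
    server_settings=ServerSettings(host="127.0.0.1", port=8000),
    model_settings=[ModelSettings(model="models/llama-2-7b.Q4_K_M.gguf")],
)

if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=8000)
```

Equivalently, run it as a module: `python -m llama_cpp.server --model models/llama-2-7b.Q4_K_M.gguf`.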