f1c631dc53
If `n_ctx` is set to 0, the code should use the maximum context length of the selected model, but this did not work: there was a problem with the initialization of this parameter, and a related problem with `n_batch`.
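The intended behavior can be sketched as a small helper. This is a minimal illustration, not the library's actual code: the function name `resolve_context_params` and its parameters are hypothetical, standing in for the initialization logic that maps `n_ctx == 0` to the model's trained context length and keeps `n_batch` consistent with it.

```python
def resolve_context_params(n_ctx: int, n_batch: int, model_max_ctx: int) -> tuple[int, int]:
    """Resolve context-window parameters at model load time.

    n_ctx == 0 is a sentinel meaning "use the model's maximum
    (trained) context length". n_batch must never exceed the
    resolved context size, so it is clamped accordingly.
    (Hypothetical helper for illustration only.)
    """
    resolved_ctx = model_max_ctx if n_ctx == 0 else n_ctx
    resolved_batch = min(n_batch, resolved_ctx)
    return resolved_ctx, resolved_batch


# With the default sentinel, the model's own limit is used:
print(resolve_context_params(0, 512, 4096))     # (4096, 512)
# An explicit n_ctx overrides the model's limit:
print(resolve_context_params(8192, 512, 4096))  # (8192, 512)
```

The bug described above amounts to the sentinel not being honored during initialization, so a user passing `n_ctx=0` did not get the model's maximum context length.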