llama.cpp/llama_cpp
Andrei b99e758045
Merge pull request #604 from aliencaocao/main-1
Add doc string for n_gpu_layers argument and make -1 offload all layers
2023-08-14 22:40:10 -04:00
Name              Last commit message                                          Date
server            Add mul_mat_q option                                         2023-08-08 14:35:06 -04:00
__init__.py       Black formatting                                             2023-03-24 14:59:29 -04:00
llama.py          make n_gpu_layers=-1 offload all layers                      2023-08-13 11:21:28 +08:00
llama_cpp.py      Update llama.cpp                                             2023-08-14 22:33:30 -04:00
llama_grammar.py  prevent memory access error by llama_grammar_free            2023-08-07 17:02:33 +09:00
llama_types.py    bugfix: fix compatibility bug with openai api on last token  2023-07-08 00:06:11 -04:00
utils.py          Suppress llama.cpp output when loading model.                2023-07-28 14:45:18 -04:00