llama.cpp

Andrei b99e758045 Merge pull request #604 from aliencaocao/main-1 Add doc string for n_gpu_layers argument and make -1 offload all layers		2023-08-14 22:40:10 -04:00
server	Add mul_mat_q option	2023-08-08 14:35:06 -04:00
__init__.py	Black formatting	2023-03-24 14:59:29 -04:00
llama.py	make n_gpu_layers=-1 offload all layers	2023-08-13 11:21:28 +08:00
llama_cpp.py	Update llama.cpp	2023-08-14 22:33:30 -04:00
llama_grammar.py	prevent memory access error by llama_grammar_free	2023-08-07 17:02:33 +09:00
llama_types.py	bugfix: fix compatibility bug with openai api on last token	2023-07-08 00:06:11 -04:00
utils.py	Suppress llama.cpp output when loading model.	2023-07-28 14:45:18 -04:00