llama.cpp/llama_cpp
bretello 39978ccaf5
add mul_mat_q parameter
This also fixes a crash when loading the 70B Llama 2 model on macOS with
Metal and `n_gpu_layers=1`.
2023-08-03 18:24:50 +02:00
server Add temporary rms_norm_eps parameter 2023-07-24 14:09:24 -04:00
__init__.py Black formatting 2023-03-24 14:59:29 -04:00
llama.py Change tensor_split from array to pointer 2023-07-25 18:29:59 +10:00
llama_cpp.py add mul_mat_q parameter 2023-08-03 18:24:50 +02:00
llama_types.py bugfix: fix compatibility bug with openai api on last token 2023-07-08 00:06:11 -04:00