c283edd7f2
Change batch size to the llama.cpp default of 8. I've seen issues in llama.cpp where batch size affects the quality of generations (it shouldn't), but in case that's still a problem I changed it back to the default. Also set the auto-determined number of threads to half the system core count: ggml will sometimes lock cores at 100% while doing nothing. This is being addressed upstream, but it can make for a bad user experience if cores stay pegged at 100%.
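As a rough illustration of the defaults described above, a caller could pass the values explicitly. This is only a sketch: the `model_path` is hypothetical, and the keyword names (`n_batch`, `n_threads`) should be checked against the current `Llama` constructor in this repo.

```python
import multiprocessing

from llama_cpp import Llama

# Batch size reset to the llama.cpp default of 8.
n_batch = 8

# Use half the logical cores so ggml's busy-waiting threads don't
# peg every core at 100% while effectively idle.
n_threads = max(multiprocessing.cpu_count() // 2, 1)

llm = Llama(
    model_path="./models/ggml-model.bin",  # hypothetical model path
    n_batch=n_batch,
    n_threads=n_threads,
)
```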