Commit graph

549 commits

SHA1        Date                        Author             Message
6d8db9d017  2023-04-29 11:42:20 -07:00  Lucas Doyle        tests: simple test for server module
468377b0e2  2023-04-29 11:41:25 -07:00  Lucas Doyle        llama_cpp server: app is now importable, still runnable as a module
755f9fa455  2023-04-29 07:19:01 -04:00  Andrei             Merge pull request #118 from SagsMug/main (Fix UnicodeDecodeError permanently)
18a0c10032  2023-04-29 12:19:22 +02:00  Mug                Remove excessive errors="ignore" and add utf8 test
523825e91d  2023-04-28 17:12:03 -04:00  Andrei Betlen      Update README
e00beb13b5  2023-04-28 17:08:18 -04:00  Andrei Betlen      Update README
5423d047c7  2023-04-28 15:33:08 -04:00  Andrei Betlen      Bump version
ea0faabae1  2023-04-28 15:32:43 -04:00  Andrei Betlen      Update llama.cpp
b7d14efc8b  2023-04-28 13:20:31 +02:00  Mug                Python weirdness
eed61289b6  2023-04-28 13:16:18 +02:00  Mug                Dont detect off tokens, detect off detokenized utf8
3a98747026  2023-04-28 12:54:28 +02:00  Mug                One day, i'll fix off by 1 errors permanently too
c39547a986  2023-04-28 12:50:30 +02:00  Mug                Detect multi-byte responses and wait
9339929f56  2023-04-26 20:00:54 -04:00  Andrei Betlen      Update llama.cpp
5f81400fcb  2023-04-26 14:45:51 +02:00  Mug                Also ignore errors on input prompts
3c130f00ca  2023-04-26 14:38:53 +02:00  Mug                Remove try catch from chat
be2c961bc9  2023-04-26 14:38:09 +02:00  Mug                Merge branch 'main' of https://github.com/abetlen/llama-cpp-python
c4a8491d42  2023-04-26 14:37:06 +02:00  Mug                Fix decode errors permanently
cbd26fdcc1  2023-04-25 19:03:41 -04:00  Andrei Betlen      Update llama.cpp
3cab3ef4cb  2023-04-25 09:11:32 -04:00  Andrei Betlen      Update n_batch for server
cc706fb944  2023-04-25 09:00:53 -04:00  Andrei Betlen      Add ctx check and re-order __init__. Closes #112
996e31d861  2023-04-25 01:37:07 -04:00  Andrei Betlen      Bump version
848c83dfd0  2023-04-25 01:36:37 -04:00  Andrei Betlen      Add FORCE_CMAKE option
9dddb3a607  2023-04-25 00:19:44 -04:00  Andrei Betlen      Bump version
d484c5634e  2023-04-24 22:18:54 -04:00  Andrei Betlen      Bugfix: Check cache keys as prefix to prompt tokens
b75fa96bf7  2023-04-24 19:56:57 -04:00  Andrei Betlen      Update docs
cbe95bbb75  2023-04-24 19:54:41 -04:00  Andrei Betlen      Add cache implementation using llama state
2c359a28ff  2023-04-24 17:51:27 -04:00  Andrei Betlen      Merge branch 'main' of github.com:abetlen/llama_cpp_python into main
197cf80601  2023-04-24 17:51:25 -04:00  Andrei Betlen      Add save/load state api for Llama class
c4c332fc51  2023-04-24 17:42:09 -04:00  Andrei Betlen      Update llama.cpp
280a047dd6  2023-04-24 15:52:24 -04:00  Andrei Betlen      Update llama.cpp
86f8e5ad91  2023-04-24 15:47:54 -04:00  Andrei Betlen      Refactor internal state for Llama class
f37456133a  2023-04-24 13:48:09 -04:00  Andrei             Merge pull request #108 from eiery/main (Update n_batch default to 512 to match upstream llama.cpp)
02cf881317  2023-04-24 09:30:10 -04:00  Andrei Betlen      Update llama.cpp
8476b325f1  2023-04-24 09:54:38 +02:00  Niek van der Maas  Change to bullseye
aa12d8a81f  2023-04-23 20:56:40 -04:00  eiery              Update llama.py (update n_batch default to 512 to match upstream llama.cpp)
7230599593  2023-04-23 14:53:17 -04:00  Andrei Betlen      Disable mmap when applying lora weights. Closes #107
e99caedbbd  2023-04-22 19:50:28 -04:00  Andrei Betlen      Update llama.cpp
643b73e155  2023-04-21 19:38:54 -04:00  Andrei Betlen      Bump version
1eb130a6b2  2023-04-21 17:40:27 -04:00  Andrei Betlen      Update llama.cpp
ba3959eafd  2023-04-20 05:15:31 -04:00  Andrei Betlen      Update llama.cpp
207adbdf13  2023-04-20 01:48:24 -04:00  Andrei Betlen      Bump version
3d290623f5  2023-04-20 01:08:15 -04:00  Andrei Betlen      Update llama.cpp
e4647c75ec  2023-04-19 15:57:46 -04:00  Andrei Betlen      Add use_mmap flag to server
207ebbc8dc  2023-04-19 14:02:11 -04:00  Andrei Betlen      Update llama.cpp
0df4d69c20  2023-04-18 23:45:25 -04:00  Andrei Betlen      If lora base is not set avoid re-loading the model by passing NULL
95c0dc134e  2023-04-18 23:44:46 -04:00  Andrei Betlen      Update type signature to allow for null pointer to be passed.
453e517fd5  2023-04-18 10:20:46 -04:00  Andrei Betlen      Add seperate lora_base path for applying LoRA to quantized models using original unquantized model weights.
32ca803bd8  2023-04-18 02:22:39 -04:00  Andrei Betlen      Merge branch 'main' of github.com:abetlen/llama_cpp_python into main
b2d44aa633  2023-04-18 02:22:35 -04:00  Andrei Betlen      Update llama.cpp
4ce6670bbd  2023-04-18 02:11:40 -04:00  Andrei             Merge pull request #87 from SagsMug/main (Fix TypeError in low_level chat)