Commit graph

1552 commits

Author SHA1 Message Date
Andrei Betlen
80184a286c Update llama.cpp 2023-05-01 10:44:28 -04:00
Lucas Doyle
efe8e6f879 llama_cpp server: slight refactor to init_llama function
Define an init_llama function that starts llama with supplied settings instead of just doing it in the global context of app.py

This allows the test to be less brittle by not needing to mess with os.environ, then importing the app
2023-04-29 11:42:23 -07:00
Lucas Doyle
6d8db9d017 tests: simple test for server module 2023-04-29 11:42:20 -07:00
Lucas Doyle
468377b0e2 llama_cpp server: app is now importable, still runnable as a module 2023-04-29 11:41:25 -07:00
Andrei
755f9fa455
Merge pull request #118 from SagsMug/main
Fix UnicodeDecodeError permanently
2023-04-29 07:19:01 -04:00
Mug
18a0c10032 Remove excessive errors="ignore" and add utf8 test 2023-04-29 12:19:22 +02:00
Andrei Betlen
523825e91d Update README 2023-04-28 17:12:03 -04:00
Andrei Betlen
e00beb13b5 Update README 2023-04-28 17:08:18 -04:00
Andrei Betlen
5423d047c7 Bump version 2023-04-28 15:33:08 -04:00
Andrei Betlen
ea0faabae1 Update llama.cpp 2023-04-28 15:32:43 -04:00
Mug
b7d14efc8b Python weirdness 2023-04-28 13:20:31 +02:00
Mug
eed61289b6 Dont detect off tokens, detect off detokenized utf8 2023-04-28 13:16:18 +02:00
Mug
3a98747026 One day, i'll fix off by 1 errors permanently too 2023-04-28 12:54:28 +02:00
Mug
c39547a986 Detect multi-byte responses and wait 2023-04-28 12:50:30 +02:00
Andrei Betlen
9339929f56 Update llama.cpp 2023-04-26 20:00:54 -04:00
Mug
5f81400fcb Also ignore errors on input prompts 2023-04-26 14:45:51 +02:00
Mug
3c130f00ca Remove try catch from chat 2023-04-26 14:38:53 +02:00
Mug
be2c961bc9 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python 2023-04-26 14:38:09 +02:00
Mug
c4a8491d42 Fix decode errors permanently 2023-04-26 14:37:06 +02:00
Andrei Betlen
cbd26fdcc1 Update llama.cpp 2023-04-25 19:03:41 -04:00
Andrei Betlen
3cab3ef4cb Update n_batch for server 2023-04-25 09:11:32 -04:00
Andrei Betlen
cc706fb944 Add ctx check and re-order __init__. Closes #112 2023-04-25 09:00:53 -04:00
Andrei Betlen
996e31d861 Bump version 2023-04-25 01:37:07 -04:00
Andrei Betlen
848c83dfd0 Add FORCE_CMAKE option 2023-04-25 01:36:37 -04:00
Andrei Betlen
9dddb3a607 Bump version 2023-04-25 00:19:44 -04:00
Andrei Betlen
d484c5634e Bugfix: Check cache keys as prefix to prompt tokens 2023-04-24 22:18:54 -04:00
Andrei Betlen
b75fa96bf7 Update docs 2023-04-24 19:56:57 -04:00
Andrei Betlen
cbe95bbb75 Add cache implementation using llama state 2023-04-24 19:54:41 -04:00
Andrei Betlen
2c359a28ff Merge branch 'main' of github.com:abetlen/llama_cpp_python into main 2023-04-24 17:51:27 -04:00
Andrei Betlen
197cf80601 Add save/load state api for Llama class 2023-04-24 17:51:25 -04:00
Andrei Betlen
c4c332fc51 Update llama.cpp 2023-04-24 17:42:09 -04:00
Andrei Betlen
280a047dd6 Update llama.cpp 2023-04-24 15:52:24 -04:00
Andrei Betlen
86f8e5ad91 Refactor internal state for Llama class 2023-04-24 15:47:54 -04:00
Andrei
f37456133a
Merge pull request #108 from eiery/main
Update n_batch default to 512 to match upstream llama.cpp
2023-04-24 13:48:09 -04:00
Andrei Betlen
02cf881317 Update llama.cpp 2023-04-24 09:30:10 -04:00
Niek van der Maas
8476b325f1
Change to bullseye 2023-04-24 09:54:38 +02:00
eiery
aa12d8a81f
Update llama.py
update n_batch default to 512 to match upstream llama.cpp
2023-04-23 20:56:40 -04:00
Andrei Betlen
7230599593 Disable mmap when applying lora weights. Closes #107 2023-04-23 14:53:17 -04:00
Andrei Betlen
e99caedbbd Update llama.cpp 2023-04-22 19:50:28 -04:00
Andrei Betlen
643b73e155 Bump version 2023-04-21 19:38:54 -04:00
Andrei Betlen
1eb130a6b2 Update llama.cpp 2023-04-21 17:40:27 -04:00
Andrei Betlen
ba3959eafd Update llama.cpp 2023-04-20 05:15:31 -04:00
Andrei Betlen
207adbdf13 Bump version 2023-04-20 01:48:24 -04:00
Andrei Betlen
3d290623f5 Update llama.cpp 2023-04-20 01:08:15 -04:00
Andrei Betlen
e4647c75ec Add use_mmap flag to server 2023-04-19 15:57:46 -04:00
Andrei Betlen
207ebbc8dc Update llama.cpp 2023-04-19 14:02:11 -04:00
Andrei Betlen
0df4d69c20 If lora base is not set avoid re-loading the model by passing NULL 2023-04-18 23:45:25 -04:00
Andrei Betlen
95c0dc134e Update type signature to allow for null pointer to be passed. 2023-04-18 23:44:46 -04:00
Andrei Betlen
453e517fd5 Add seperate lora_base path for applying LoRA to quantized models using original unquantized model weights. 2023-04-18 10:20:46 -04:00
Andrei Betlen
32ca803bd8 Merge branch 'main' of github.com:abetlen/llama_cpp_python into main 2023-04-18 02:22:39 -04:00