Author | Commit | Message | Date
Andrei Betlen | dd9ad1c759 | Formatting | 2023-05-01 21:51:16 -04:00
Andrei Betlen | b6747f722e | Fix logprob calculation. Fixes #134 | 2023-05-01 17:45:08 -04:00
Andrei Betlen | 350a1769e1 | Update sampling api | 2023-05-01 14:47:55 -04:00
Mug | 18a0c10032 | Remove excessive errors="ignore" and add utf8 test | 2023-04-29 12:19:22 +02:00
Mug | b7d14efc8b | Python weirdness | 2023-04-28 13:20:31 +02:00
Mug | eed61289b6 | Don't detect off tokens, detect off detokenized utf8 | 2023-04-28 13:16:18 +02:00
Mug | 3a98747026 | One day, I'll fix off-by-1 errors permanently too | 2023-04-28 12:54:28 +02:00
Mug | c39547a986 | Detect multi-byte responses and wait | 2023-04-28 12:50:30 +02:00
Mug | 5f81400fcb | Also ignore errors on input prompts | 2023-04-26 14:45:51 +02:00
Mug | be2c961bc9 | Merge branch 'main' of https://github.com/abetlen/llama-cpp-python | 2023-04-26 14:38:09 +02:00
Mug | c4a8491d42 | Fix decode errors permanently | 2023-04-26 14:37:06 +02:00
Andrei Betlen | cc706fb944 | Add ctx check and re-order __init__. Closes #112 | 2023-04-25 09:00:53 -04:00
Andrei Betlen | d484c5634e | Bugfix: Check cache keys as prefix to prompt tokens | 2023-04-24 22:18:54 -04:00
Andrei Betlen | cbe95bbb75 | Add cache implementation using llama state | 2023-04-24 19:54:41 -04:00
Andrei Betlen | 2c359a28ff | Merge branch 'main' of github.com:abetlen/llama_cpp_python into main | 2023-04-24 17:51:27 -04:00
Andrei Betlen | 197cf80601 | Add save/load state api for Llama class | 2023-04-24 17:51:25 -04:00
Andrei Betlen | 86f8e5ad91 | Refactor internal state for Llama class | 2023-04-24 15:47:54 -04:00
Andrei | f37456133a | Merge pull request #108 from eiery/main: update n_batch default to 512 to match upstream llama.cpp | 2023-04-24 13:48:09 -04:00
eiery | aa12d8a81f | Update llama.py: update n_batch default to 512 to match upstream llama.cpp | 2023-04-23 20:56:40 -04:00
Andrei Betlen | 7230599593 | Disable mmap when applying lora weights. Closes #107 | 2023-04-23 14:53:17 -04:00
Andrei Betlen | 0df4d69c20 | If lora base is not set, avoid re-loading the model by passing NULL | 2023-04-18 23:45:25 -04:00
Andrei Betlen | 453e517fd5 | Add separate lora_base path for applying LoRA to quantized models using original unquantized model weights | 2023-04-18 10:20:46 -04:00
Andrei Betlen | eb7f278cc6 | Add lora_path parameter to Llama model | 2023-04-18 01:43:44 -04:00
Andrei Betlen | 89856ef00d | Bugfix: only eval new tokens | 2023-04-15 17:32:53 -04:00
Andrei Betlen | 92c077136d | Add experimental cache | 2023-04-15 12:03:09 -04:00
Andrei Betlen | a6372a7ae5 | Update stop sequences for chat | 2023-04-15 12:02:48 -04:00
Andrei Betlen | 83b2be6dc4 | Update chat parameters | 2023-04-15 11:58:43 -04:00
Andrei Betlen | 62087514c6 | Update chat prompt | 2023-04-15 11:58:19 -04:00
Andrei Betlen | 02f9fb82fb | Bugfix | 2023-04-15 11:39:52 -04:00
Andrei Betlen | 3cd67c7bd7 | Add type annotations | 2023-04-15 11:39:21 -04:00
Andrei Betlen | d7de0e8014 | Bugfix | 2023-04-15 00:08:04 -04:00
Andrei Betlen | e90e122f2a | Use clear | 2023-04-14 23:33:18 -04:00
Andrei Betlen | ac7068a469 | Track generated tokens internally | 2023-04-14 23:33:00 -04:00
Andrei Betlen | 6e298d8fca | Set kv cache size to f16 by default | 2023-04-14 22:21:19 -04:00
Andrei Betlen | 6153baab2d | Clean up logprobs implementation | 2023-04-14 09:59:33 -04:00
Andrei Betlen | 26cc4ee029 | Fix signature for stop parameter | 2023-04-14 09:59:08 -04:00
Andrei Betlen | 6595ad84bf | Add field to disable resetting between generations | 2023-04-13 00:28:00 -04:00
Andrei Betlen | 22fa5a621f | Revert "Deprecate generate method" (reverts commit 6cf5876538) | 2023-04-13 00:19:55 -04:00
Andrei Betlen | c854c2564b | Don't serialize stateful parameters | 2023-04-12 14:07:14 -04:00
Andrei Betlen | 2f9b649005 | Style fix | 2023-04-12 14:06:22 -04:00
Andrei Betlen | 6cf5876538 | Deprecate generate method | 2023-04-12 14:06:04 -04:00
Andrei Betlen | b3805bb9cc | Implement logprobs parameter for text completion. Closes #2 | 2023-04-12 14:05:11 -04:00
Andrei Betlen | 1f67ad2a0b | Add use_mmap option | 2023-04-10 02:11:35 -04:00
Andrei Betlen | 314ce7d1cc | Fix cpu count default | 2023-04-08 19:54:04 -04:00
Andrei Betlen | 3fbc06361f | Formatting | 2023-04-08 16:01:45 -04:00
Andrei Betlen | e96a5c5722 | Make Llama instance pickleable. Closes #27 | 2023-04-05 06:52:17 -04:00
Andrei Betlen | cefc69ea43 | Add runtime check to ensure embedding is enabled if trying to generate embeddings | 2023-04-05 03:25:37 -04:00
Andrei Betlen | 5c50af7462 | Remove workaround | 2023-04-05 03:25:09 -04:00
Andrei Betlen | c137789143 | Add verbose flag. Closes #19 | 2023-04-04 13:09:24 -04:00
Andrei Betlen | 5075c16fcc | Bugfix: n_batch should always be <= n_ctx | 2023-04-04 13:08:21 -04:00