Andrei Betlen | a7ba85834f | Add n_ctx, n_vocab, and n_embd properties | 2023-05-20 08:13:41 -04:00
Andrei Betlen | 01a010be52 | Fix llama_cpp and Llama type signatures. Closes #221 | 2023-05-19 11:59:33 -04:00
Andrei Betlen | a8cd169251 | Bugfix: Stop sequences can be strings | 2023-05-19 03:15:08 -04:00
Andrei Betlen | 17d4271b04 | Fix logprobs for completions and implement for streaming logprobs. | 2023-05-19 02:20:27 -04:00
Andrei Betlen | f0ec6e615e | Stream tokens instead of text chunks | 2023-05-18 11:35:59 -04:00
Andrei Betlen | 4f342795e5 | Update token checks | 2023-05-17 03:35:13 -04:00
Andrei Betlen | f5c2f998ab | Format | 2023-05-17 02:00:39 -04:00
Andrei Betlen | d28b753ed2 | Implement penalize_nl | 2023-05-17 01:53:26 -04:00
Andrei Betlen | f11e2a781c | Fix last_n_tokens_size | 2023-05-17 01:42:51 -04:00
Andrei Betlen | 7e55244540 | Fix top_k value. Closes #220 | 2023-05-17 01:41:42 -04:00
Andrei Betlen | a7c9e38287 | Update variable name | 2023-05-16 18:07:25 -04:00
Andrei Betlen | a3352923c7 | Add model_alias option to override model_path in completions. Closes #39 | 2023-05-16 17:22:00 -04:00
Andrei Betlen | a65125c0bd | Add sampling defaults for generate | 2023-05-16 09:35:50 -04:00
Andrei Betlen | cdf59768f5 | Update llama.cpp | 2023-05-14 00:04:22 -04:00
Andrei Betlen | 7a536e86c2 | Allow model to tokenize strings longer than context length and set add_bos. Closes #92 | 2023-05-12 14:28:22 -04:00
Andrei Betlen | 7be584fe82 | Add missing tfs_z paramter | 2023-05-11 21:56:19 -04:00
Andrei Betlen | cdeaded251 | Bugfix: Ensure logs are printed when streaming | 2023-05-10 16:12:17 -04:00
Andrei Betlen | d957422bf4 | Implement sampling as in llama.cpp main example | 2023-05-08 21:21:25 -04:00
Andrei Betlen | 93a9019bb1 | Merge branch 'main' of github.com:abetlen/llama_cpp_python into Maximilian-Winter/main | 2023-05-08 19:57:09 -04:00
Andrei Betlen | 65d9cc050c | Add openai frequency and presence penalty parameters. Closes #169 | 2023-05-08 01:30:18 -04:00
Andrei Betlen | e72f58614b | Change pointer to lower overhead byref | 2023-05-07 20:01:34 -04:00
Andrei Betlen | 0e94a70de1 | Add in-memory longest prefix cache. Closes #158 | 2023-05-07 19:31:26 -04:00
Andrei Betlen | 2753b85321 | Format | 2023-05-07 13:19:56 -04:00
Andrei Betlen | 7c3743fe5f | Update llama.cpp | 2023-05-07 00:12:47 -04:00
Andrei Betlen | bc853e3742 | Fix type for eval_logits in LlamaState object | 2023-05-06 21:32:50 -04:00
Maximilian Winter | 515d9bde7e | Fixed somethings and activated cublas | 2023-05-06 23:40:19 +02:00
Maximilian Winter | aa203a0d65 | Added mirostat sampling to the high level API. | 2023-05-06 22:47:47 +02:00
Andrei Betlen | 98bbd1c6a8 | Fix eval logits type | 2023-05-05 14:23:14 -04:00
Andrei Betlen | 66e28eb548 | Fix temperature bug | 2023-05-05 14:00:41 -04:00
Andrei Betlen | b6a9a0b6ba | Add types for all low-level api functions | 2023-05-05 12:22:27 -04:00
Andrei Betlen | 5be0efa5f8 | Cache should raise KeyError when key is missing | 2023-05-05 12:21:49 -04:00
Andrei Betlen | 853dc711cc | Format | 2023-05-04 21:58:36 -04:00
Andrei Betlen | 97c6372350 | Rewind model to longest prefix. | 2023-05-04 21:58:27 -04:00
Andrei Betlen | 329297fafb | Bugfix: Missing logits_to_logprobs | 2023-05-04 12:18:40 -04:00
Andrei Betlen | 9e5b6d675a | Improve logging messages | 2023-05-03 10:28:10 -04:00
Andrei Betlen | 43f2907e3a | Support smaller state sizes | 2023-05-03 09:33:50 -04:00
Andrei Betlen | dd9ad1c759 | Formatting | 2023-05-01 21:51:16 -04:00
Andrei Betlen | b6747f722e | Fix logprob calculation. Fixes #134 | 2023-05-01 17:45:08 -04:00
Andrei Betlen | 350a1769e1 | Update sampling api | 2023-05-01 14:47:55 -04:00
Mug | 18a0c10032 | Remove excessive errors="ignore" and add utf8 test | 2023-04-29 12:19:22 +02:00
Mug | b7d14efc8b | Python weirdness | 2023-04-28 13:20:31 +02:00
Mug | eed61289b6 | Dont detect off tokens, detect off detokenized utf8 | 2023-04-28 13:16:18 +02:00
Mug | 3a98747026 | One day, i'll fix off by 1 errors permanently too | 2023-04-28 12:54:28 +02:00
Mug | c39547a986 | Detect multi-byte responses and wait | 2023-04-28 12:50:30 +02:00
Mug | 5f81400fcb | Also ignore errors on input prompts | 2023-04-26 14:45:51 +02:00
Mug | be2c961bc9 | Merge branch 'main' of https://github.com/abetlen/llama-cpp-python | 2023-04-26 14:38:09 +02:00
Mug | c4a8491d42 | Fix decode errors permanently | 2023-04-26 14:37:06 +02:00
Andrei Betlen | cc706fb944 | Add ctx check and re-order __init__. Closes #112 | 2023-04-25 09:00:53 -04:00
Andrei Betlen | d484c5634e | Bugfix: Check cache keys as prefix to prompt tokens | 2023-04-24 22:18:54 -04:00
Andrei Betlen | cbe95bbb75 | Add cache implementation using llama state | 2023-04-24 19:54:41 -04:00