Andrei Betlen | a65125c0bd | Add sampling defaults for generate | 2023-05-16 09:35:50 -04:00
Andrei Betlen | cbac19bf24 | Add winmode arg only on windows if python version supports it | 2023-05-15 09:15:01 -04:00
Andrei Betlen | c804efe3f0 | Fix obscure Windows DLL issue. Closes #208 | 2023-05-14 22:08:11 -04:00
Andrei Betlen | cdf59768f5 | Update llama.cpp | 2023-05-14 00:04:22 -04:00
Andrei Betlen | 7a536e86c2 | Allow model to tokenize strings longer than context length and set add_bos. Closes #92 | 2023-05-12 14:28:22 -04:00
Andrei Betlen | 8740ddc58e | Only support generating one prompt at a time. | 2023-05-12 07:21:46 -04:00
Andrei Betlen | 8895b9002a | Revert "llama_cpp server: prompt is a string" (reverts commit b9098b0ef7). Closes #187 | 2023-05-12 07:16:57 -04:00
Andrei Betlen | 7be584fe82 | Add missing tfs_z parameter | 2023-05-11 21:56:19 -04:00
Andrei Betlen | cdeaded251 | Bugfix: Ensure logs are printed when streaming | 2023-05-10 16:12:17 -04:00
Lucas Doyle | 02e8a018ae | llama_cpp server: document presence_penalty and frequency_penalty, mark as supported | 2023-05-09 16:25:00 -07:00
Andrei Betlen | d957422bf4 | Implement sampling as in llama.cpp main example | 2023-05-08 21:21:25 -04:00
Andrei Betlen | 93a9019bb1 | Merge branch 'main' of github.com:abetlen/llama_cpp_python into Maximilian-Winter/main | 2023-05-08 19:57:09 -04:00
Andrei Betlen | 82d138fe54 | Fix: default repeat_penalty | 2023-05-08 18:49:11 -04:00
Andrei Betlen | 29f094bbcf | Bugfix: not falling back to environment variables when a default value is set | 2023-05-08 14:46:25 -04:00
Andrei Betlen | 0d6c60097a | Show default value when --help is called | 2023-05-08 14:21:15 -04:00
Andrei Betlen | 022e9ebcb8 | Use environment variable if parsed cli arg is None | 2023-05-08 14:20:53 -04:00
Andrei Betlen | 0d751a69a7 | Set repeat_penalty to 0 by default | 2023-05-08 01:50:43 -04:00
Andrei Betlen | 65d9cc050c | Add openai frequency and presence penalty parameters. Closes #169 | 2023-05-08 01:30:18 -04:00
Andrei Betlen | a0b61ea2a7 | Bugfix for models endpoint | 2023-05-07 20:17:52 -04:00
Andrei Betlen | e72f58614b | Change pointer to lower overhead byref | 2023-05-07 20:01:34 -04:00
Andrei Betlen | 14da46f16e | Added cache size to settings object. | 2023-05-07 19:33:17 -04:00
Andrei Betlen | 0e94a70de1 | Add in-memory longest prefix cache. Closes #158 | 2023-05-07 19:31:26 -04:00
Andrei Betlen | 8dfde63255 | Fix return type | 2023-05-07 19:30:14 -04:00
Andrei Betlen | 2753b85321 | Format | 2023-05-07 13:19:56 -04:00
Andrei Betlen | 627811ea83 | Add verbose flag to server | 2023-05-07 05:09:10 -04:00
Andrei Betlen | 3fbda71790 | Fix mlock_supported and mmap_supported return type | 2023-05-07 03:04:22 -04:00
Andrei Betlen | 5a3413eee3 | Update cpu_count | 2023-05-07 03:03:57 -04:00
Andrei Betlen | 1a00e452ea | Update settings fields and defaults | 2023-05-07 02:52:20 -04:00
Andrei Betlen | 86753976c4 | Revert "llama_cpp server: delete some ignored / unused parameters" (reverts commit b47b9549d5) | 2023-05-07 02:02:34 -04:00
Andrei Betlen | c382d8f86a | Revert "llama_cpp server: mark model as required" (reverts commit e40fcb0575) | 2023-05-07 02:00:22 -04:00
Andrei Betlen | d8fddcce73 | Merge branch 'main' of github.com:abetlen/llama_cpp_python into better-server-params-and-fields | 2023-05-07 01:54:00 -04:00
Andrei Betlen | 7c3743fe5f | Update llama.cpp | 2023-05-07 00:12:47 -04:00
Andrei Betlen | bc853e3742 | Fix type for eval_logits in LlamaState object | 2023-05-06 21:32:50 -04:00
Maximilian Winter | 515d9bde7e | Fixed some things and activated cuBLAS | 2023-05-06 23:40:19 +02:00
Maximilian Winter | aa203a0d65 | Added mirostat sampling to the high level API. | 2023-05-06 22:47:47 +02:00
Andrei Betlen | 98bbd1c6a8 | Fix eval logits type | 2023-05-05 14:23:14 -04:00
Andrei Betlen | b5f3e74627 | Add return type annotations for embeddings and logits | 2023-05-05 14:22:55 -04:00
Andrei Betlen | 3e28e0e50c | Fix: runtime type errors | 2023-05-05 14:12:26 -04:00
Andrei Betlen | e24c3d7447 | Prefer explicit imports | 2023-05-05 14:05:31 -04:00
Andrei Betlen | 40501435c1 | Fix: types | 2023-05-05 14:04:12 -04:00
Andrei Betlen | 66e28eb548 | Fix temperature bug | 2023-05-05 14:00:41 -04:00
Andrei Betlen | 6702d2abfd | Fix candidates type | 2023-05-05 14:00:30 -04:00
Andrei Betlen | 5e7ddfc3d6 | Fix llama_cpp types | 2023-05-05 13:54:22 -04:00
Andrei Betlen | b6a9a0b6ba | Add types for all low-level api functions | 2023-05-05 12:22:27 -04:00
Andrei Betlen | 5be0efa5f8 | Cache should raise KeyError when key is missing | 2023-05-05 12:21:49 -04:00
Andrei Betlen | 24fc38754b | Add cli options to server. Closes #37 | 2023-05-05 12:08:28 -04:00
Andrei Betlen | 853dc711cc | Format | 2023-05-04 21:58:36 -04:00
Andrei Betlen | 97c6372350 | Rewind model to longest prefix. | 2023-05-04 21:58:27 -04:00
Andrei Betlen | 329297fafb | Bugfix: Missing logits_to_logprobs | 2023-05-04 12:18:40 -04:00
Lucas Doyle | 3008a954c1 | Merge branch 'main' of github.com:abetlen/llama-cpp-python into better-server-params-and-fields | 2023-05-03 13:10:03 -07:00