Commit graph

199 commits

Author SHA1 Message Date
Andrei Betlen
cbac19bf24 Add winmode arg only on Windows if Python version supports it 2023-05-15 09:15:01 -04:00
Andrei Betlen
c804efe3f0 Fix obscure Windows DLL issue. Closes #208 2023-05-14 22:08:11 -04:00
Andrei Betlen
cdf59768f5 Update llama.cpp 2023-05-14 00:04:22 -04:00
Andrei Betlen
7a536e86c2 Allow model to tokenize strings longer than context length and set add_bos. Closes #92 2023-05-12 14:28:22 -04:00
Andrei Betlen
8740ddc58e Only support generating one prompt at a time. 2023-05-12 07:21:46 -04:00
Andrei Betlen
8895b9002a Revert "llama_cpp server: prompt is a string". Closes #187
This reverts commit b9098b0ef7.
2023-05-12 07:16:57 -04:00
Andrei Betlen
7be584fe82 Add missing tfs_z parameter 2023-05-11 21:56:19 -04:00
Andrei Betlen
cdeaded251 Bugfix: Ensure logs are printed when streaming 2023-05-10 16:12:17 -04:00
Lucas Doyle
02e8a018ae llama_cpp server: document presence_penalty and frequency_penalty, mark as supported 2023-05-09 16:25:00 -07:00
Andrei Betlen
d957422bf4 Implement sampling as in llama.cpp main example 2023-05-08 21:21:25 -04:00
Andrei Betlen
93a9019bb1 Merge branch 'main' of github.com:abetlen/llama_cpp_python into Maximilian-Winter/main 2023-05-08 19:57:09 -04:00
Andrei Betlen
82d138fe54 Fix: default repeat_penalty 2023-05-08 18:49:11 -04:00
Andrei Betlen
29f094bbcf Bugfix: not falling back to environment variables when default value is set. 2023-05-08 14:46:25 -04:00
Andrei Betlen
0d6c60097a Show default value when --help is called 2023-05-08 14:21:15 -04:00
Andrei Betlen
022e9ebcb8 Use environment variable if parsed cli arg is None 2023-05-08 14:20:53 -04:00
Andrei Betlen
0d751a69a7 Set repeat_penalty to 0 by default 2023-05-08 01:50:43 -04:00
Andrei Betlen
65d9cc050c Add openai frequency and presence penalty parameters. Closes #169 2023-05-08 01:30:18 -04:00
Andrei Betlen
a0b61ea2a7 Bugfix for models endpoint 2023-05-07 20:17:52 -04:00
Andrei Betlen
e72f58614b Change pointer to lower-overhead byref 2023-05-07 20:01:34 -04:00
Andrei Betlen
14da46f16e Added cache size to settings object. 2023-05-07 19:33:17 -04:00
Andrei Betlen
0e94a70de1 Add in-memory longest prefix cache. Closes #158 2023-05-07 19:31:26 -04:00
Andrei Betlen
8dfde63255 Fix return type 2023-05-07 19:30:14 -04:00
Andrei Betlen
2753b85321 Format 2023-05-07 13:19:56 -04:00
Andrei Betlen
627811ea83 Add verbose flag to server 2023-05-07 05:09:10 -04:00
Andrei Betlen
3fbda71790 Fix mlock_supported and mmap_supported return type 2023-05-07 03:04:22 -04:00
Andrei Betlen
5a3413eee3 Update cpu_count 2023-05-07 03:03:57 -04:00
Andrei Betlen
1a00e452ea Update settings fields and defaults 2023-05-07 02:52:20 -04:00
Andrei Betlen
86753976c4 Revert "llama_cpp server: delete some ignored / unused parameters"
This reverts commit b47b9549d5.
2023-05-07 02:02:34 -04:00
Andrei Betlen
c382d8f86a Revert "llama_cpp server: mark model as required"
This reverts commit e40fcb0575.
2023-05-07 02:00:22 -04:00
Andrei Betlen
d8fddcce73 Merge branch 'main' of github.com:abetlen/llama_cpp_python into better-server-params-and-fields 2023-05-07 01:54:00 -04:00
Andrei Betlen
7c3743fe5f Update llama.cpp 2023-05-07 00:12:47 -04:00
Andrei Betlen
bc853e3742 Fix type for eval_logits in LlamaState object 2023-05-06 21:32:50 -04:00
Maximilian Winter
515d9bde7e Fixed some things and activated cublas 2023-05-06 23:40:19 +02:00
Maximilian Winter
aa203a0d65 Added mirostat sampling to the high level API. 2023-05-06 22:47:47 +02:00
Andrei Betlen
98bbd1c6a8 Fix eval logits type 2023-05-05 14:23:14 -04:00
Andrei Betlen
b5f3e74627 Add return type annotations for embeddings and logits 2023-05-05 14:22:55 -04:00
Andrei Betlen
3e28e0e50c Fix: runtime type errors 2023-05-05 14:12:26 -04:00
Andrei Betlen
e24c3d7447 Prefer explicit imports 2023-05-05 14:05:31 -04:00
Andrei Betlen
40501435c1 Fix: types 2023-05-05 14:04:12 -04:00
Andrei Betlen
66e28eb548 Fix temperature bug 2023-05-05 14:00:41 -04:00
Andrei Betlen
6702d2abfd Fix candidates type 2023-05-05 14:00:30 -04:00
Andrei Betlen
5e7ddfc3d6 Fix llama_cpp types 2023-05-05 13:54:22 -04:00
Andrei Betlen
b6a9a0b6ba Add types for all low-level api functions 2023-05-05 12:22:27 -04:00
Andrei Betlen
5be0efa5f8 Cache should raise KeyError when key is missing 2023-05-05 12:21:49 -04:00
Andrei Betlen
24fc38754b Add cli options to server. Closes #37 2023-05-05 12:08:28 -04:00
Andrei Betlen
853dc711cc Format 2023-05-04 21:58:36 -04:00
Andrei Betlen
97c6372350 Rewind model to longest prefix. 2023-05-04 21:58:27 -04:00
Andrei Betlen
329297fafb Bugfix: Missing logits_to_logprobs 2023-05-04 12:18:40 -04:00
Lucas Doyle
3008a954c1 Merge branch 'main' of github.com:abetlen/llama-cpp-python into better-server-params-and-fields 2023-05-03 13:10:03 -07:00
Andrei Betlen
9e5b6d675a Improve logging messages 2023-05-03 10:28:10 -04:00