182 commits

Author SHA1 Message Date
Limour
f165048a69 feat: add support for KV cache quantization options (#1307)
* add KV cache quantization options

https://github.com/abetlen/llama-cpp-python/discussions/1220
https://github.com/abetlen/llama-cpp-python/issues/1305

* Add ggml_type

* Use ggml_type instead of string for quantization

* Add server support

---------

Co-authored-by: Andrei Betlen <abetlen@gmail.com>
2024-04-01 10:19:28 -04:00
Andrei Betlen
125b2358c9 feat: Update llama.cpp 2024-03-28 12:06:46 -04:00
Andrei Betlen
901fe02461 feat: Update llama.cpp 2024-03-26 22:58:53 -04:00
Andrei Betlen
e325a831f0 feat: Update llama.cpp 2024-03-22 23:43:29 -04:00
Andrei Betlen
8d298b4750 feat: Update llama.cpp 2024-03-18 10:26:36 -04:00
Andrei Betlen
6eb25231e4 feat: Update llama.cpp 2024-03-15 12:58:45 -04:00
Andrei Betlen
d318cc8b83 fix: Set default pooling_type to mean, check for null pointer. 2024-03-14 09:17:41 -04:00
Andrei Betlen
dd0ee56217 feat: Update llama.cpp 2024-03-13 15:57:35 -04:00
Andrei Betlen
08e910f7a7 feat: Update llama.cpp 2024-03-10 23:45:05 -04:00
Andrei Betlen
40c6b54f68 feat: Update llama.cpp 2024-03-08 20:58:50 -05:00
Andrei Betlen
93dc56ace8 Update llama.cpp 2024-03-06 01:32:00 -05:00
Andrei Betlen
87a6e5797e feat: Update llama.cpp 2024-03-03 11:27:04 -05:00
Andrei Betlen
0e70984fb6 feat: Update llama.cpp 2024-03-02 22:20:04 -05:00
Andrei Betlen
f062a7f51d feat: Update llama.cpp 2024-03-01 12:57:16 -05:00
Andrei Betlen
8c71725d53 fix: Remove deprecated cfg sampling functions 2024-02-28 14:37:07 -05:00
Andrei Betlen
0d37ce52b1 feat: Update llama.cpp 2024-02-28 14:27:16 -05:00
Andrei Betlen
fea33c9b94 feat: Update llama.cpp 2024-02-27 12:22:17 -05:00
Andrei Betlen
9558ce7878 feat: Update llama.cpp 2024-02-26 11:40:58 -05:00
Andrei Betlen
cbbcd888af feat: Update llama.cpp 2024-02-25 20:52:14 -05:00
Andrei Betlen
19234aa0db fix: Restore type hints for low-level api 2024-02-25 16:54:37 -05:00
Andrei Betlen
2292af5796 feat: Update llama.cpp 2024-02-25 16:53:58 -05:00
Andrei Betlen
221edb9ef1 feat: Update llama.cpp 2024-02-24 23:47:29 -05:00
Andrei Betlen
a0ce429dc0 misc: use decorator to bind low level api functions, fixes docs 2024-02-23 03:39:38 -05:00
Andrei Betlen
e10af30cf1 fix: TypeAlias import error 2024-02-22 03:27:28 -05:00
Andrei Betlen
aefcb8f71a misc: additional type annotations for low level api 2024-02-22 02:00:09 -05:00
Andrei Betlen
0653e15c20 feat: Update llama.cpp 2024-02-21 23:04:52 -05:00
Andrei
7f51b6071f feat(low-level-api): Improve API static type-safety and performance (#1205) 2024-02-21 16:25:38 -05:00
Andrei Betlen
4edde21b3d feat: Update llama.cpp 2024-02-21 11:05:58 -05:00
Andrei Betlen
6225f027e5 feat: Update llama.cpp 2024-02-19 04:11:34 -05:00
Andrei Betlen
748c0ce057 feat: Update llama.cpp 2024-02-18 21:30:36 -05:00
Andrei Betlen
fdce078cb9 feat: Update llama.cpp 2024-02-17 00:37:51 -05:00
Andrei Betlen
a5cfeb7763 feat: Update llama.cpp 2024-02-15 15:17:30 -05:00
Andrei Betlen
f7cdf78788 Update llama.cpp 2024-02-13 12:24:00 -05:00
Andrei Betlen
69413ce08e Update llama.cpp 2024-02-11 19:00:17 -05:00
Andrei Betlen
3553b14670 Update llama.cpp 2024-02-05 13:26:50 -05:00
Andrei Betlen
71e3e4c435 Update llama.cpp 2024-01-31 10:41:42 -05:00
Andrei Betlen
011cd84ded Update llama.cpp 2024-01-30 09:48:09 -05:00
Andrei Betlen
35918873b4 Update llama.cpp 2024-01-26 11:45:48 -05:00
Andrei Betlen
c970d41a85 fix: llama_log_set should be able to accept null pointer 2024-01-24 10:38:30 -05:00
Andrei Betlen
fcdf337d84 Update llama.cpp 2024-01-22 11:25:11 -05:00
Andrei Betlen
89cce50f8c Update llama.cpp 2024-01-18 21:21:49 -05:00
Andrei Betlen
5502ac8876 Update llama.cpp 2024-01-15 10:12:10 -05:00
Andrei Betlen
359ae73643 Update llama.cpp 2024-01-14 08:17:22 -05:00
Andrei Betlen
7c898d5684 Update llama.cpp 2024-01-13 22:37:49 -05:00
Andrei Betlen
bb610b9428 Update llama.cpp 2024-01-11 22:51:12 -05:00
Andrei Betlen
1ae05c102b Update llama.cpp 2024-01-08 14:51:29 -05:00
Andrei Betlen
eb9c7d4ed8 Update llama.cpp 2024-01-03 22:04:04 -05:00
Andrei Betlen
92284f32cb Add HIP_PATH to dll search directories for windows users. 2023-12-22 15:29:56 -05:00
Andrei Betlen
2b0d3f36fa set llama_max_devices using library function 2023-12-22 15:19:28 -05:00
Andrei Betlen
6d8bc090f9 fix: incorrect bindings for kv override. Based on #1011 2023-12-22 14:52:20 -05:00