Andrei Betlen
a05b4da80a
fix: float32 is not JSON serializable when streaming logits.
2023-12-18 18:40:36 -05:00
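The fix above points at a common pitfall: numpy scalar types (such as `np.float32` logit values) leak into response dicts and break `json.dumps`. A minimal sketch of the cast-before-serialize approach (helper name hypothetical, not the actual patch):

```python
import json
import numpy as np

def to_serializable(value):
    # json.dumps() rejects numpy scalars like np.float32 with a TypeError;
    # casting to the builtin float first keeps streamed chunks serializable.
    if isinstance(value, np.floating):
        return float(value)
    return value

chunk = {"logprob": to_serializable(np.float32(-0.25))}
print(json.dumps(chunk))
```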
Andrei Betlen
abda047284
Update changelog
2023-12-18 18:16:17 -05:00
Andrei Betlen
7df6c32544
Fix type annotations
2023-12-18 18:14:53 -05:00
Andrei Betlen
b703aad79e
Fix type annotation
2023-12-18 18:13:37 -05:00
Andrei Betlen
d0aedfcff6
Fix type annotation
2023-12-18 18:12:49 -05:00
Eduard Christian Dumitrescu
2993936b10
Fix ctypes definitions of llama_kv_cache_view_update and llama_kv_cache_view_free. ( #1028 )
2023-12-18 18:11:26 -05:00
Andrei Betlen
5e863d8a3b
Bump version
2023-12-18 16:09:18 -05:00
Jonathan Soma
cfd698c75c
Update low_level_api_llama_cpp.py to match current API ( #1023 )
2023-12-18 15:59:11 -05:00
Andrei Betlen
095c650006
Add offload_kqv option to llama and server
2023-12-18 15:36:09 -05:00
Andrei Betlen
472b344ae3
Remove unused import
2023-12-18 15:32:40 -05:00
Andrei Betlen
2fc48c54be
Update llama.cpp
2023-12-18 15:32:15 -05:00
kddubey
6b2e0e05b4
perf: Don't convert logprobs arrays to lists ( #1021 )
2023-12-18 14:28:12 -05:00
Brandon Roberts
62944df142
Bugfix: Remove f16_kv, add offload_kqv field ( #1019 )
F16_KV appears to have been removed here: af99c6fbfc
This addresses two issues:
- #995 which just requests to add the KV cache offloading param
- #1006 a NULL ptr exception when using the embeddings (introduced by
leaving f16_kv in the fields struct)
2023-12-18 14:27:11 -05:00
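The second issue hints at why a stale ctypes field is dangerous: if the Python-side `Structure` keeps a field the C header has dropped, every later field is read at the wrong byte offset, which can surface as NULL-pointer crashes. An illustrative sketch (not the real llama.cpp layout):

```python
import ctypes

# The _fields_ list must mirror the C struct exactly: leaving a removed
# field (like f16_kv) in place shifts the offsets of everything after it.
class context_params(ctypes.Structure):
    _fields_ = [
        ("n_ctx", ctypes.c_uint32),
        ("offload_kqv", ctypes.c_bool),  # replaces the removed f16_kv field
    ]

params = context_params(n_ctx=2048, offload_kqv=True)
print(params.n_ctx, params.offload_kqv)
```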
evelynmitchell
37da8e863a
Update README.md functionary demo typo ( #996 )
missing comma
2023-12-16 19:00:30 -05:00
Daniele Morotti
f1c631dc53
Bug fixed with n_ctx=0 ( #1015 )
If n_ctx is set to 0, the code should use the maximum context length of the selected model, but it didn't work. There was a problem with the initialization of this parameter and a related problem with 'n_batch'.
2023-12-16 18:59:50 -05:00
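The intended behavior reduces to a simple fallback: treat `n_ctx == 0` as "use the model's trained context length". A minimal sketch of that logic, with hypothetical names:

```python
def resolve_n_ctx(n_ctx: int, n_ctx_train: int) -> int:
    # n_ctx == 0 means "use the model's maximum (trained) context length";
    # any positive value is taken as an explicit override.
    return n_ctx_train if n_ctx == 0 else n_ctx

print(resolve_n_ctx(0, 4096), resolve_n_ctx(512, 4096))
```

The related n_batch problem follows the same shape: the batch size must be clamped so it never exceeds the resolved context size.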
kddubey
5a8944672f
Fix logits_to_logprobs for 2-D and 3-D logits ( #1002 )
* Fix logits_to_logprobs for 2-D and 3-D logits
* Set dtype to single
* Test size
2023-12-16 18:59:26 -05:00
Andrei Betlen
534b1ea9b5
Update llama.cpp
2023-12-16 18:57:43 -05:00
Andrei Betlen
cbce061ffd
Bump version
2023-12-13 21:52:29 -05:00
yhfgyyf
8b4db732bd
Add qwen chat format ( #1005 )
2023-12-13 21:43:43 -05:00
Andrei Betlen
690c563b60
Merge branch 'main' of github.com:abetlen/llama_cpp_python into main
2023-12-13 21:43:19 -05:00
Andrei Betlen
c0fc0a1e82
Update llama.cpp
2023-12-13 21:43:16 -05:00
Radoslav Gerganov
8e44a32075
Add support for running the server with SSL ( #994 )
2023-12-11 20:47:11 -05:00
Tanner Hobson
ef22e478db
Replace logits_to_logprobs implementation with numpy equivalent to llama.cpp ( #991 )
See #990. This change makes the logits_to_logprobs function equivalent to the version in the llama.cpp repository. It uses numpy, so it's much faster than the previous version.
2023-12-11 20:46:27 -05:00
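The numpy version amounts to a vectorized, numerically stable log-softmax over the vocabulary axis. A sketch of the idea (not the exact code in the PR):

```python
import numpy as np

def logits_to_logprobs(logits: np.ndarray) -> np.ndarray:
    # Stable log-softmax over the vocab (last) axis: subtract the max
    # before exponentiating, then subtract the log of the normalizer.
    # Reducing over axis=-1 keeps this correct for 1-D, 2-D, and 3-D input.
    maximum = np.max(logits, axis=-1, keepdims=True)
    shifted = logits - maximum
    return shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))

probs = np.exp(logits_to_logprobs(np.array([0.0, 1.0, 2.0])))
```

Because the whole computation runs inside numpy, it avoids the per-token Python loop of the earlier implementation.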
zocainViken
ac35f68e4d
Fix UnsupportedOperation: fileno in suppress_stdout_stderr ( #961 )
* bug fixing
* llava from readme got this error: UnsupportedOperation: fileno; quick fix by checking hasattr
* multi modal params fix: add logits = True -> to make llava work
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2023-12-11 20:44:51 -05:00
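The quick fix described above boils down to probing whether the active stream has a real file descriptor before trying to redirect it. A sketch of that check (function name hypothetical; the actual patch uses hasattr):

```python
import io
import os

def has_real_fileno(stream) -> bool:
    # Replaced stdout/stderr objects (notebooks, capture tools) may lack
    # fileno() entirely, or define it but raise UnsupportedOperation when
    # called, so probe defensively before redirecting the descriptor.
    try:
        stream.fileno()
    except (AttributeError, io.UnsupportedOperation):
        return False
    return True

with open(os.devnull, "w") as real_file:
    print(has_real_fileno(real_file), has_real_fileno(io.StringIO()))
```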
chiensen
b938cccf05
Add Pygmalion chat format ( #986 )
2023-12-11 20:44:04 -05:00
zocainViken
6bbeea07ae
README.md multimodal params fix ( #967 )
multi modal params fix: add logits = True -> to make llava work
2023-12-11 20:41:38 -05:00
Aniket Maurya
c1d92ce680
fix minor typo ( #958 )
* fix minor typo
* Fix typo
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2023-12-11 20:40:38 -05:00
Andrei Betlen
e9bc4c4baf
Fix docker build
2023-12-11 10:39:51 -05:00
Andrei Betlen
c1e73e73a3
Bump version
2023-12-11 10:26:42 -05:00
Andrei Betlen
ec26f364cc
Remove f16_kv
2023-12-11 10:25:37 -05:00
Andrei Betlen
f1edc66b21
Update llama.cpp
2023-12-11 10:21:35 -05:00
Andrei Betlen
f3b844ed0a
Update llama.cpp
2023-11-29 05:40:22 -05:00
kddubey
b069d06346
Fix #891 ( #952 )
2023-11-29 05:39:52 -05:00
Andrei Betlen
ad963a0961
Bump version
2023-11-28 04:58:20 -05:00
Andrei Betlen
e3941d9c67
Make building llava optional
2023-11-28 04:55:21 -05:00
Andrei Betlen
74f1949206
Update llama.cpp
2023-11-28 04:54:51 -05:00
Andrei Betlen
fb32f9d438
docs: Update README
2023-11-28 03:15:01 -05:00
Andrei Betlen
43e006a291
docs: Remove divider
2023-11-28 02:41:50 -05:00
Andrei Betlen
2cc6c9ae2f
docs: Update README, add FAQ
2023-11-28 02:37:34 -05:00
Andrei Betlen
7f3704b896
Bump version
2023-11-27 19:14:25 -05:00
Andrei Betlen
f99b2385ee
Update llama.cpp
2023-11-27 19:03:10 -05:00
Andrei Betlen
396dbf0b2b
docs: Improve low-level docstrings
2023-11-27 19:03:02 -05:00
Andrei Betlen
9c68b1804a
docs: Add api reference links in README
2023-11-27 18:54:07 -05:00
Andrei Betlen
174ef3ddf6
docs: Add headings to API reference
2023-11-27 18:42:15 -05:00
Andrei Betlen
41428244f0
docs: Fix README indentation
2023-11-27 18:29:13 -05:00
Andrei Betlen
1539146a5e
docs: Fix README docs link
2023-11-27 18:21:00 -05:00
Andrei Betlen
a928893d03
Merge branch 'main' of github.com:abetlen/llama_cpp_python into main
2023-11-26 15:57:13 -05:00
Andrei Betlen
6308f21d5e
docs: Update Llama docs
2023-11-26 15:56:40 -05:00
Anton Vice
aa5a7a1880
Update README.md ( #940 )
.ccp >> .cpp
2023-11-26 15:39:38 -05:00
Gardner Bickford
c2d63a7148
fix: Typo in the Open Orca chat format #874 ( #947 )
2023-11-26 15:39:18 -05:00