Author | Commit | Message | Date
Andrei Betlen | 5a045fcbbc | Update llama.cpp | 2023-10-19 17:37:07 -04:00
Andrei Betlen | ef03d77b59 | Enable finish reason tests | 2023-10-19 02:56:45 -04:00
gmcgoldr | 09a8406c83 | Fix streaming not returning finish reason (#798) | 2023-10-19 02:55:56 -04:00
    When streaming, the yield that contains the finish reason can be skipped. This change ensures that yield isn't skipped.
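A minimal sketch of the streaming pattern #798 restores (illustrative, not the library's actual generator): the loop must still emit a final chunk carrying finish_reason, even when that chunk contains no new text.

```python
from typing import Iterator, Optional


def stream_chunks(tokens: list, max_tokens: int) -> Iterator[dict]:
    """Yield OpenAI-style streaming chunks for a list of decoded tokens."""
    finish_reason: Optional[str] = None
    for i, token in enumerate(tokens):
        if i + 1 >= max_tokens:
            finish_reason = "length"
        yield {"choices": [{"text": token, "finish_reason": None}]}
        if finish_reason:
            break
    if finish_reason is None:
        finish_reason = "stop"
    # The buggy version could return before this point, so clients never
    # saw a chunk with a non-null finish_reason.
    yield {"choices": [{"text": "", "finish_reason": finish_reason}]}


for chunk in stream_chunks(["Hello", ",", " world"], max_tokens=8):
    print(chunk["choices"][0])
```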
Andrei Betlen | 28c2b884e2 | Merge branch 'main' of github.com:abetlen/llama_cpp_python into main | 2023-10-19 02:55:31 -04:00
Andrei Betlen | cbeef36510 | Re-enable completion function tests | 2023-10-19 02:55:29 -04:00
Andrei Betlen | ff580031d2 | Update llama.cpp | 2023-10-19 02:55:08 -04:00
Xiaoyu Kevin Hu | a315128d66 | Update value check for n_gpu_layers field (#826) | 2023-10-18 18:25:25 -04:00
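A hedged sketch of the kind of bound change #826 describes, assuming the server validates n_gpu_layers with a Pydantic field constraint and that -1 conventionally means "offload all layers" (the class name below is illustrative):

```python
from pydantic import BaseModel, Field


class ModelSettings(BaseModel):
    # ge=-1 rather than ge=0: -1 is the conventional "use all layers" value.
    n_gpu_layers: int = Field(
        default=0,
        ge=-1,
        description="Number of layers to offload to the GPU (-1 for all).",
    )


print(ModelSettings(n_gpu_layers=-1))  # valid once the check is relaxed
```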
Andrei Betlen | d989ac86e6 | Update llama.cpp | 2023-10-15 15:12:57 -04:00
Pierre Alexandre SCHEMBRI | 10304d75fc | Make use of suppress_stdout_stderr when freeing model (#803) | 2023-10-15 13:52:43 -04:00
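suppress_stdout_stderr silences native llama.cpp output during model teardown. A sketch of such a context manager, assuming the usual file-descriptor duplication approach (the library's own helper may differ in detail):

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def suppress_stdout_stderr():
    sys.stdout.flush()
    sys.stderr.flush()
    devnull = os.open(os.devnull, os.O_WRONLY)
    saved_out, saved_err = os.dup(1), os.dup(2)
    try:
        os.dup2(devnull, 1)  # redirect the process-level fds so output
        os.dup2(devnull, 2)  # from native C/C++ code is silenced too
        yield
    finally:
        sys.stdout.flush()
        sys.stderr.flush()
        os.dup2(saved_out, 1)
        os.dup2(saved_err, 2)
        for fd in (devnull, saved_out, saved_err):
            os.close(fd)


with suppress_stdout_stderr():
    print("hidden")  # swallowed by /dev/null
print("visible")
```

Redirecting the raw descriptors (rather than reassigning sys.stdout) matters here because the noise comes from C code that writes to fd 1 and 2 directly.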
Ma, Guokai | a1ac199980 | Fix repeat greeting (#808) | 2023-10-15 13:52:21 -04:00
    * Fix repeated greeting
    * Remove separator between role and message
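An illustrative reconstruction of the class of bug #808 addresses, not the actual patch: a prompt builder that re-emits the system greeting on every turn makes the model see it repeated; the fix emits it once and joins each role to its message without a stray separator.

```python
def build_prompt(system: str, turns: list) -> str:
    parts = [system]  # greeting emitted exactly once, up front
    for role, message in turns:
        parts.append(f"{role.upper()}: {message}")  # no extra separator token
    parts.append("ASSISTANT:")
    return "\n".join(parts)


print(build_prompt(
    "You are a helpful assistant.",
    [("user", "Hi"), ("assistant", "Hello!"), ("user", "Can you help?")],
))
```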
Eric Liu | b50166500e | Add validation for tensor_split size exceeding LLAMA_MAX_DEVICES (#820) | 2023-10-15 13:51:51 -04:00
    * Add validation for tensor_split size exceeding LLAMA_MAX_DEVICES
    * Reword
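The guard from #820, sketched under the assumption that LLAMA_MAX_DEVICES is the compile-time device limit the bindings expose (the constant's value below is a placeholder):

```python
LLAMA_MAX_DEVICES = 16  # placeholder; the real value comes from llama.cpp


def validate_tensor_split(tensor_split):
    """Reject a tensor_split list longer than the backend supports."""
    if tensor_split is not None and len(tensor_split) > LLAMA_MAX_DEVICES:
        raise ValueError(
            "Attempt to split tensors across more devices than supported. "
            f"LLAMA_MAX_DEVICES={LLAMA_MAX_DEVICES}"
        )


validate_tensor_split([0.5, 0.5])    # fine
# validate_tensor_split([0.1] * 32)  # would raise ValueError
```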
Andrei Betlen | f30aa20126 | Update llama.cpp | 2023-10-12 02:24:50 -04:00
Andrei Betlen | 622bff19b2 | Update llama.cpp | 2023-10-10 19:23:35 -04:00
Andrei Betlen | d6a130a052 | Print traceback on server error | 2023-10-10 15:56:04 -04:00
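"Print traceback on server error" amounts to logging the full stack trace before the 500 response goes out. A sketch in FastAPI terms, since the server is FastAPI-based (the handler name and response body are illustrative):

```python
import traceback

from fastapi import FastAPI, Request
from starlette.responses import JSONResponse

app = FastAPI()


@app.exception_handler(Exception)
async def log_unhandled(request: Request, exc: Exception) -> JSONResponse:
    # Print the full stack trace to the server's stdout/stderr
    # before returning an opaque 500 to the client.
    traceback.print_exception(type(exc), exc, exc.__traceback__)
    return JSONResponse(status_code=500, content={"error": "internal server error"})
```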
Andrei Betlen | 43dfe1e2ab | Update llama.cpp | 2023-10-05 16:07:49 -04:00
Andrei Betlen | 2c0456acf0 | Update llama.cpp | 2023-10-04 20:19:31 -04:00
Andrei Betlen | c305be6db6 | Merge branch 'main' of github.com:abetlen/llama_cpp_python into main | 2023-10-03 15:23:37 -04:00
Andrei Betlen | a7d17b8ac9 | Update llama.cpp | 2023-10-03 15:23:35 -04:00
ccshen | b76724cddc | Update instructions to download GGUF model (#783) | 2023-10-02 11:46:47 -04:00
    Co-authored-by: john.shen <john.shen@bioclinica.com>
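A hedged example of fetching a GGUF model file with huggingface_hub (the repo and filename below are common examples, not taken from the commit):

```python
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="TheBloke/Llama-2-7B-Chat-GGUF",  # example repo
    filename="llama-2-7b-chat.Q4_K_M.gguf",   # example quantization
)
print(model_path)  # local path, usable as Llama(model_path=...)
```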
Andrei Betlen | 305482bd41 | Add chatml chat format | 2023-09-30 21:01:34 -04:00
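For reference, the ChatML layout this commit adds support for, sketched as a plain string builder (the library's real formatter is structured differently):

```python
def format_chatml(messages: list) -> str:
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    prompt += "<|im_start|>assistant\n"  # generation continues from here
    return prompt


print(format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]))
```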
Andrei Betlen | 5ef5280ef9 | Log server exceptions to stdout | 2023-09-30 19:13:36 -04:00
Andrei Betlen | f0af1c7201 | Update llama.cpp | 2023-09-30 19:09:50 -04:00
Andrei Betlen | fab4bccc35 | Bump version | 2023-09-30 16:04:46 -04:00
Andrei Betlen | d696251fbe | Fix logits_all bug | 2023-09-30 16:02:35 -04:00
Andrei Betlen | 6ee413d79e | Bump version | 2023-09-30 13:23:09 -04:00
Andrei Betlen | 42bb721d64 | Fix bug in embedding | 2023-09-30 13:20:22 -04:00
Andrei Betlen | bca965325d | Update CHANGELOG | 2023-09-30 00:08:45 -04:00
Andrei Betlen | 5d62d55a82 | Bump version | 2023-09-30 00:07:06 -04:00
Andrei Betlen | ac853e01e1 | Include git directories | 2023-09-30 00:01:14 -04:00
Andrei Betlen | 9e76613629 | Remove git repo exclude | 2023-09-29 23:28:59 -04:00
Andrei Betlen | b4939c2d99 | Revert BUILD_NUMBER fix | 2023-09-29 23:28:45 -04:00
Andrei Betlen | 541aaff45e | Quote fix attempt #2 | 2023-09-29 23:05:26 -04:00
Andrei Betlen | 39e5feb138 | Fix quote issue | 2023-09-29 23:01:38 -04:00
Andrei Betlen | 3c6e98f945 | Use dev versioning for test PyPI | 2023-09-29 22:57:49 -04:00
Andrei Betlen | 1cca20304b | Revert update to publish test PyPI | 2023-09-29 22:48:17 -04:00
Andrei Betlen | 85e4d08a2e | Update publish to test PyPI workflow | 2023-09-29 22:32:31 -04:00
Andrei Betlen | 43f8fc371a | Potential fix for pip install bug | 2023-09-29 22:24:22 -04:00
Andrei Betlen | 386c88b68e | Bump version | 2023-09-29 20:07:31 -04:00
Andrei Betlen | d9bce17794 | Update server params | 2023-09-29 19:59:12 -04:00
Andrei Betlen | 3720c739d4 | Update llama.cpp | 2023-09-29 19:58:21 -04:00
Andrei | 3bca7708fb | Configurable Chat Formats (#711) | 2023-09-29 19:52:04 -04:00
    * Add configurable default chat completion format
    * Remove chat_template file to avoid circular import
    * Update llama_types
    * Add chat format
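"Configurable" in #711 means formats are looked up by name at completion time rather than hard-coded. A registry/dispatch sketch of that idea; the names and shapes below are illustrative, not the library's exact API:

```python
from typing import Callable, Dict, List

CHAT_FORMATS: Dict[str, Callable[[List[dict]], str]] = {}


def register_chat_format(name: str):
    """Decorator that registers a message-list-to-prompt formatter by name."""
    def decorator(fn: Callable[[List[dict]], str]):
        CHAT_FORMATS[name] = fn
        return fn
    return decorator


@register_chat_format("simple")
def format_simple(messages: List[dict]) -> str:
    turns = "".join(f"### {m['role']}:\n{m['content']}\n" for m in messages)
    return turns + "### assistant:\n"


def create_chat_completion(messages: List[dict], chat_format: str = "simple") -> str:
    prompt = CHAT_FORMATS[chat_format](messages)  # configurable per call
    return prompt  # a real path would run generation on this prompt


print(create_chat_completion([{"role": "user", "content": "Hi"}]))
```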
Josh XT | a945404b4a | Fix rope scaling defaults (#767) | 2023-09-29 16:03:57 -04:00
    * Fix rope scale with backwards compatibility
    * Fix defaults
    * Fix op
    * Remove backwards compatibility
    * Check single val
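A hedged sketch of the defaulting rule #767 is concerned with: treat 0.0 as "not set" and fall back to the model's own rope values instead of silently overriding them with hard-coded constants. Names and fallback values below are illustrative:

```python
MODEL_DEFAULT_FREQ_BASE = 10000.0  # would come from the model's metadata
MODEL_DEFAULT_FREQ_SCALE = 1.0


def resolve_rope_params(rope_freq_base: float = 0.0,
                        rope_freq_scale: float = 0.0) -> tuple:
    # 0.0 means "caller didn't choose a value": keep the model's default.
    base = rope_freq_base if rope_freq_base != 0.0 else MODEL_DEFAULT_FREQ_BASE
    scale = rope_freq_scale if rope_freq_scale != 0.0 else MODEL_DEFAULT_FREQ_SCALE
    return base, scale


print(resolve_rope_params())                          # model defaults
print(resolve_rope_params(rope_freq_base=1000000.0))  # explicit override wins
```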
Andrei Betlen | a72efc77de | Update llama.cpp | 2023-09-28 23:25:14 -04:00
Andrei Betlen | 1a1c3dc418 | Update llama.cpp | 2023-09-28 22:42:03 -04:00
Andrei Betlen | 4177ae6d34 | Bump version | 2023-09-25 14:38:38 -04:00
Andrei Betlen | 1ed0f3ebe1 | Bump scikit-build-core version to one that includes fix for Windows CMake | 2023-09-25 14:20:09 -04:00
Andrei Betlen | f7b785a00f | Update CHANGELOG | 2023-09-25 13:58:23 -04:00
Andrei Betlen | cf8ae5a69c | Merge branch 'main' of github.com:abetlen/llama_cpp_python into main | 2023-09-25 13:57:00 -04:00
Andrei Betlen | 5da57734bc | Update llama.cpp | 2023-09-25 13:56:52 -04:00
Viacheslav/Slava Tradunsky | 3d5e5b1c04 | Adds openai-processing-ms response header (#748) | 2023-09-25 13:55:58 -04:00
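A sketch of how an openai-processing-ms response header can be attached in a FastAPI app via middleware, mirroring what #748 adds (the app and middleware function here are illustrative):

```python
import time

from fastapi import FastAPI, Request

app = FastAPI()


@app.middleware("http")
async def add_processing_ms(request: Request, call_next):
    start = time.perf_counter()
    response = await call_next(request)
    elapsed_ms = int((time.perf_counter() - start) * 1000)
    # Same header the OpenAI API returns, so OpenAI-compatible clients
    # can read request latency from either backend.
    response.headers["openai-processing-ms"] = str(elapsed_ms)
    return response
```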