461 commits

Author SHA1 Message Date
Andrei Betlen
341c50b5b0 Fix CMakeLists.txt 2023-05-16 09:07:14 -04:00
Andrei Betlen
cbac19bf24 Add winmode arg only on windows if python version supports it 2023-05-15 09:15:01 -04:00
Andrei Betlen
c804efe3f0 Fix obscure Windows DLL issue. Closes #208 2023-05-14 22:08:11 -04:00
Andrei Betlen
ceec21f1e9 Update llama.cpp 2023-05-14 22:07:35 -04:00
Andrei Betlen
d90c9df326 Bump version 2023-05-14 00:04:49 -04:00
Andrei Betlen
cdf59768f5 Update llama.cpp 2023-05-14 00:04:22 -04:00
Andrei Betlen
7a536e86c2 Allow model to tokenize strings longer than context length and set add_bos. Closes #92 2023-05-12 14:28:22 -04:00
Andrei Betlen
8740ddc58e Only support generating one prompt at a time. 2023-05-12 07:21:46 -04:00
Andrei Betlen
8895b9002a Revert "llama_cpp server: prompt is a string". Closes #187
This reverts commit b9098b0ef7.
2023-05-12 07:16:57 -04:00
Andrei Betlen
684d7c8c17 Fix docker command 2023-05-11 22:12:35 -04:00
Andrei Betlen
fa1fc4ec42 Merge branch 'main' of github.com:abetlen/llama_cpp_python into main 2023-05-11 21:56:54 -04:00
Andrei Betlen
e3d3c31da2 Bump version 2023-05-11 21:56:43 -04:00
Andrei Betlen
7be584fe82 Add missing tfs_z parameter 2023-05-11 21:56:19 -04:00
Andrei Betlen
28ee2adec2 Update llama.cpp 2023-05-11 21:15:12 -04:00
Andrei Betlen
35229f5eab Update llama.cpp 2023-05-11 10:05:34 -04:00
Andrei Betlen
cdeaded251 Bugfix: Ensure logs are printed when streaming 2023-05-10 16:12:17 -04:00
Andrei
c3ed1330d7
Merge pull request #177 from joelkurian/main
Updated installation instructions for BLAS backends
2023-05-10 05:27:12 -04:00
Andrei
3c96b43cf4
Merge pull request #178 from Stonelinks/document-presence-frequency-penalty
Document presence frequency penalty
2023-05-09 23:55:52 -04:00
Lucas Doyle
02e8a018ae llama_cpp server: document presence_penalty and frequency_penalty, mark as supported 2023-05-09 16:25:00 -07:00
Joel Kurian
17dc51a7d2 Updated installation instructions for BLAS backends 2023-05-09 21:34:46 +05:30
Andrei Betlen
d957422bf4 Implement sampling as in llama.cpp main example 2023-05-08 21:21:25 -04:00
Andrei Betlen
93a9019bb1 Merge branch 'main' of github.com:abetlen/llama_cpp_python into Maximilian-Winter/main 2023-05-08 19:57:09 -04:00
Andrei Betlen
f315b82832 Revert changes to llama.cpp and setup.py 2023-05-08 19:53:21 -04:00
Andrei
7499fc1cbb
Merge pull request #126 from Stonelinks/deprecate-example-server
Deprecate example server
2023-05-08 19:29:04 -04:00
Andrei
1971514fa5
Merge pull request #173 from abetlen/dependabot/pip/mkdocs-material-9.1.11
Bump mkdocs-material from 9.1.9 to 9.1.11
2023-05-08 19:28:01 -04:00
Andrei Betlen
7af1f4c672 Merge branch 'main' of github.com:abetlen/llama_cpp_python into main 2023-05-08 18:49:38 -04:00
Andrei Betlen
c37883b477 Bump version 2023-05-08 18:49:37 -04:00
Andrei Betlen
82d138fe54 Fix: default repeat_penalty 2023-05-08 18:49:11 -04:00
dependabot[bot]
b1489befda
Bump mkdocs-material from 9.1.9 to 9.1.11
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.1.9 to 9.1.11.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/9.1.9...9.1.11)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-05-08 21:04:42 +00:00
Andrei
ed0f48b4bb
Merge pull request #153 from SagsMug/main
Update low_level_api examples
2023-05-08 14:58:47 -04:00
Andrei Betlen
a3cc7bf5b2 Bump version 2023-05-08 14:46:50 -04:00
Andrei Betlen
29f094bbcf Bugfix: not falling back to environment variables when default value is set. 2023-05-08 14:46:25 -04:00
Andrei Betlen
6d69461ef5 Bump version 2023-05-08 14:21:47 -04:00
Andrei Betlen
0d6c60097a Show default value when --help is called 2023-05-08 14:21:15 -04:00
Andrei Betlen
022e9ebcb8 Use environment variable if parsed cli arg is None 2023-05-08 14:20:53 -04:00
Mug
eaf9f19aa9 Fix lora 2023-05-08 15:27:42 +02:00
Mug
2c0d9b182c Fix session loading and saving in low level example chat 2023-05-08 15:27:03 +02:00
Mug
ed66a469c9 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python 2023-05-08 14:49:48 +02:00
Andrei Betlen
0d751a69a7 Set repeat_penalty to 0 by default 2023-05-08 01:50:43 -04:00
Andrei Betlen
65d9cc050c Add openai frequency and presence penalty parameters. Closes #169 2023-05-08 01:30:18 -04:00
Andrei Betlen
75d8619b1a Bump version 2023-05-07 20:19:34 -04:00
Andrei Betlen
a0b61ea2a7 Bugfix for models endpoint 2023-05-07 20:17:52 -04:00
Andrei Betlen
e72f58614b Change pointer to lower overhead byref 2023-05-07 20:01:34 -04:00
Andrei Betlen
14da46f16e Added cache size to settings object. 2023-05-07 19:33:17 -04:00
Andrei Betlen
0e94a70de1 Add in-memory longest prefix cache. Closes #158 2023-05-07 19:31:26 -04:00
Andrei Betlen
8dfde63255 Fix return type 2023-05-07 19:30:14 -04:00
Andrei Betlen
2753b85321 Format 2023-05-07 13:19:56 -04:00
Andrei Betlen
4f8cf52a38 Update README 2023-05-07 05:20:04 -04:00
Andrei Betlen
3adc8fb3ae Update README to use cli options for server 2023-05-07 05:10:52 -04:00
Andrei Betlen
627811ea83 Add verbose flag to server 2023-05-07 05:09:10 -04:00