Commit graph

145 commits

Author SHA1 Message Date
Andrei Betlen
baa825dacb Add windows and mac runners 2023-04-06 21:27:01 -04:00
Andrei Betlen
da539cc2ee Safer calculation of default n_threads 2023-04-06 21:22:19 -04:00
Andrei Betlen
9b7526895d Bump version 2023-04-06 21:19:08 -04:00
Andrei Betlen
7851cc1e3c Don't install pydantic by default 2023-04-06 21:10:34 -04:00
Andrei Betlen
09707f5b2a Remove console script 2023-04-06 21:08:32 -04:00
Andrei Betlen
930db37dd2 Merge branch 'main' of github.com:abetlen/llama_cpp_python into main 2023-04-06 21:07:38 -04:00
Andrei Betlen
55279b679d Handle prompt list 2023-04-06 21:07:35 -04:00
Andrei
c2e690b326 Merge pull request #29 from MillionthOdin16/main: Fixes and Tweaks to Defaults 2023-04-06 21:06:31 -04:00
MillionthOdin16
2e91affea2 Ignore ./idea folder 2023-04-05 18:23:17 -04:00
MillionthOdin16
c283edd7f2 Set n_batch to default values and reduce thread count:
Change batch size to the llama.cpp default of 8. I've seen issues in llama.cpp where batch size affects the quality of generations (it shouldn't), but in case that's still an issue I changed it to the default.

Set the auto-determined number of threads to 1/2 of the system count. ggml will sometimes lock cores at 100% while doing nothing. This is being addressed, but it can cause a bad experience for the user if cores are pegged at 100%.
2023-04-05 18:17:29 -04:00
MillionthOdin16
b9b6dfd23f Merge remote-tracking branch 'origin/main' 2023-04-05 17:51:43 -04:00
MillionthOdin16
76a82babef Set n_batch to the default value of 8. I think this is leftover from when n_ctx was missing and n_batch was 2048. 2023-04-05 17:44:53 -04:00
Andrei Betlen
38f7dea6ca Update README and docs 2023-04-05 17:44:25 -04:00
MillionthOdin16
1e90597983 Add pydantic dep. Errors if pydantic isn't present. Also throws errors relating to TypedDict or subclass() if the version is too old or new... 2023-04-05 17:37:06 -04:00
Andrei Betlen
267d3648fc Bump version 2023-04-05 16:26:22 -04:00
Andrei Betlen
74bf043ddd Update llama.cpp 2023-04-05 16:25:54 -04:00
Andrei Betlen
44448fb3a8 Add server as a subpackage 2023-04-05 16:23:25 -04:00
Andrei Betlen
e1b5b9bb04 Update fastapi server example 2023-04-05 14:44:26 -04:00
Andrei Betlen
6de2f24aca Bump version 2023-04-05 06:53:43 -04:00
Andrei Betlen
e96a5c5722 Make Llama instance pickleable. Closes #27 2023-04-05 06:52:17 -04:00
Andrei Betlen
152e4695c3 Bump Version 2023-04-05 04:43:51 -04:00
Andrei Betlen
c177c807e5 Add supported python versions 2023-04-05 04:43:19 -04:00
Andrei Betlen
17fdd1547c Update workflow name and add badge to README 2023-04-05 04:41:24 -04:00
Andrei Betlen
7643f6677d Bugfix for Python3.7 2023-04-05 04:37:33 -04:00
Andrei Betlen
4d015c33bd Fix syntax error 2023-04-05 04:35:15 -04:00
Andrei Betlen
47570df17b Checkout submodules 2023-04-05 04:34:19 -04:00
Andrei Betlen
e3f999e732 Add missing scikit-build install 2023-04-05 04:31:38 -04:00
Andrei Betlen
43c20d3282 Add initial github action to run automated tests 2023-04-05 04:30:32 -04:00
Andrei Betlen
b1babcf56c Add quantize example 2023-04-05 04:17:26 -04:00
Andrei Betlen
c8e13a78d0 Re-organize examples folder 2023-04-05 04:10:13 -04:00
Andrei Betlen
c16bda5fb9 Add performance tuning notebook 2023-04-05 04:09:19 -04:00
Andrei Betlen
cefc69ea43 Add runtime check to ensure embedding is enabled if trying to generate embeddings 2023-04-05 03:25:37 -04:00
Andrei Betlen
5c50af7462 Remove workaround 2023-04-05 03:25:09 -04:00
Andrei Betlen
c3972b61ae Add basic tests. Closes #24 2023-04-05 03:23:15 -04:00
Andrei Betlen
51dbcf2693 Bugfix: wrong signature for quantize function 2023-04-04 22:36:59 -04:00
Andrei Betlen
8279fb7d92 Bump version 2023-04-04 17:17:11 -04:00
Andrei Betlen
c137789143 Add verbose flag. Closes #19 2023-04-04 13:09:24 -04:00
Andrei Betlen
5075c16fcc Bugfix: n_batch should always be <= n_ctx 2023-04-04 13:08:21 -04:00
Andrei Betlen
248b0566fa Update README 2023-04-04 10:57:22 -04:00
Andrei Betlen
ffe34cf64d Allow user to set llama config from env vars 2023-04-04 00:52:44 -04:00
Andrei Betlen
05eb2087d8 Small fixes for examples 2023-04-03 20:33:07 -04:00
Andrei Betlen
caf3c0362b Add return type for default __call__ method 2023-04-03 20:26:08 -04:00
Andrei Betlen
4aa349d777 Add docstring for create_chat_completion 2023-04-03 20:24:20 -04:00
Andrei Betlen
4615f1e520 Add chat completion method to docs 2023-04-03 20:14:03 -04:00
Andrei Betlen
5cf29d0231 Bump version 2023-04-03 20:13:46 -04:00
Andrei Betlen
7fedf16531 Add support for chat completion 2023-04-03 20:12:44 -04:00
Andrei Betlen
3dec778c90 Update to more sensible return signature 2023-04-03 20:12:14 -04:00
Andrei Betlen
f7ab8d55b2 Update context size defaults Close #11 2023-04-03 20:11:13 -04:00
Andrei Betlen
c0a5c0171f Add embed back into documentation 2023-04-03 18:53:00 -04:00
Andrei Betlen
adf656d542 Bump version 2023-04-03 18:46:49 -04:00