Andrei Betlen
d75196d7a1
Install with pip during build step
Use setup.py install
Upgrade version of setuptools
Revert to develop
Use setup.py build and pip install
Just use pip install
Use correct name in pyproject.toml
Make pip install verbose
2023-04-06 22:21:45 -04:00
Andrei Betlen
dd1c298620
Fix typo
2023-04-06 21:28:03 -04:00
Andrei Betlen
baa825dacb
Add windows and mac runners
2023-04-06 21:27:01 -04:00
Andrei Betlen
da539cc2ee
Safer calculation of default n_threads
2023-04-06 21:22:19 -04:00
Andrei Betlen
9b7526895d
Bump version
2023-04-06 21:19:08 -04:00
Andrei Betlen
7851cc1e3c
Don't install pydantic by default
2023-04-06 21:10:34 -04:00
Andrei Betlen
09707f5b2a
Remove console script
2023-04-06 21:08:32 -04:00
Andrei Betlen
930db37dd2
Merge branch 'main' of github.com:abetlen/llama_cpp_python into main
2023-04-06 21:07:38 -04:00
Andrei Betlen
55279b679d
Handle prompt list
2023-04-06 21:07:35 -04:00
Andrei
c2e690b326
Merge pull request #29 from MillionthOdin16/main
Fixes and Tweaks to Defaults
2023-04-06 21:06:31 -04:00
MillionthOdin16
2e91affea2
Ignore ./idea folder
2023-04-05 18:23:17 -04:00
MillionthOdin16
c283edd7f2
Set n_batch to default values and reduce thread count:
Change batch size to the llama.cpp default of 8. I've seen issues in llama.cpp where batch size affects quality of generations (it shouldn't), but in case that's still an issue I changed it to the default.
Set the auto-determined number of threads to half the system core count. ggml will sometimes lock cores at 100% while doing nothing; this is being addressed, but it can cause a bad experience for the user if cores are pegged at 100%.
2023-04-05 18:17:29 -04:00
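The thread-count default described in the commit above can be sketched as follows. This is an illustrative helper, not the library's actual API; the function name is hypothetical.

```python
import multiprocessing


def default_n_threads() -> int:
    # Use half the logical core count, per the commit above, so ggml
    # doesn't peg every core at 100% while idling.
    # max(1, ...) guards against single-core systems.
    return max(1, multiprocessing.cpu_count() // 2)
```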
MillionthOdin16
b9b6dfd23f
Merge remote-tracking branch 'origin/main'
2023-04-05 17:51:43 -04:00
MillionthOdin16
76a82babef
Set n_batch to the default value of 8. I think this is leftover from when n_ctx was missing and n_batch was 2048.
2023-04-05 17:44:53 -04:00
Andrei Betlen
38f7dea6ca
Update README and docs
2023-04-05 17:44:25 -04:00
MillionthOdin16
1e90597983
Add pydantic dep. Errors if pydantic isn't present. Also throws errors relating to TypedDict or subclass() if the version is too old or too new.
2023-04-05 17:37:06 -04:00
Andrei Betlen
267d3648fc
Bump version
2023-04-05 16:26:22 -04:00
Andrei Betlen
74bf043ddd
Update llama.cpp
2023-04-05 16:25:54 -04:00
Andrei Betlen
44448fb3a8
Add server as a subpackage
2023-04-05 16:23:25 -04:00
Andrei Betlen
e1b5b9bb04
Update fastapi server example
2023-04-05 14:44:26 -04:00
Andrei Betlen
6de2f24aca
Bump version
2023-04-05 06:53:43 -04:00
Andrei Betlen
e96a5c5722
Make Llama instance pickleable. Closes #27
2023-04-05 06:52:17 -04:00
Andrei Betlen
152e4695c3
Bump Version
2023-04-05 04:43:51 -04:00
Andrei Betlen
c177c807e5
Add supported python versions
2023-04-05 04:43:19 -04:00
Andrei Betlen
17fdd1547c
Update workflow name and add badge to README
2023-04-05 04:41:24 -04:00
Andrei Betlen
7643f6677d
Bugfix for Python3.7
2023-04-05 04:37:33 -04:00
Andrei Betlen
4d015c33bd
Fix syntax error
2023-04-05 04:35:15 -04:00
Andrei Betlen
47570df17b
Checkout submodules
2023-04-05 04:34:19 -04:00
Andrei Betlen
e3f999e732
Add missing scikit-build install
2023-04-05 04:31:38 -04:00
Andrei Betlen
43c20d3282
Add initial github action to run automated tests
2023-04-05 04:30:32 -04:00
Andrei Betlen
b1babcf56c
Add quantize example
2023-04-05 04:17:26 -04:00
Andrei Betlen
c8e13a78d0
Re-organize examples folder
2023-04-05 04:10:13 -04:00
Andrei Betlen
c16bda5fb9
Add performance tuning notebook
2023-04-05 04:09:19 -04:00
Andrei Betlen
cefc69ea43
Add runtime check to ensure embedding is enabled if trying to generate embeddings
2023-04-05 03:25:37 -04:00
Andrei Betlen
5c50af7462
Remove workaround
2023-04-05 03:25:09 -04:00
Andrei Betlen
c3972b61ae
Add basic tests. Closes #24
2023-04-05 03:23:15 -04:00
Andrei Betlen
51dbcf2693
Bugfix: wrong signature for quantize function
2023-04-04 22:36:59 -04:00
Andrei Betlen
8279fb7d92
Bump version
2023-04-04 17:17:11 -04:00
Andrei Betlen
c137789143
Add verbose flag. Closes #19
2023-04-04 13:09:24 -04:00
Andrei Betlen
5075c16fcc
Bugfix: n_batch should always be <= n_ctx
2023-04-04 13:08:21 -04:00
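The invariant from the bugfix above (n_batch must never exceed n_ctx) amounts to a simple clamp. A minimal sketch, with hypothetical names rather than the library's real parameters:

```python
def clamp_n_batch(n_batch: int, n_ctx: int) -> int:
    # Enforce the invariant from the bugfix above: the batch size
    # passed to llama.cpp must always be <= the context size.
    return min(n_batch, n_ctx)
```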
Andrei Betlen
248b0566fa
Update README
2023-04-04 10:57:22 -04:00
Andrei Betlen
ffe34cf64d
Allow user to set llama config from env vars
2023-04-04 00:52:44 -04:00
Andrei Betlen
05eb2087d8
Small fixes for examples
2023-04-03 20:33:07 -04:00
Andrei Betlen
caf3c0362b
Add return type for default __call__ method
2023-04-03 20:26:08 -04:00
Andrei Betlen
4aa349d777
Add docstring for create_chat_completion
2023-04-03 20:24:20 -04:00
Andrei Betlen
4615f1e520
Add chat completion method to docs
2023-04-03 20:14:03 -04:00
Andrei Betlen
5cf29d0231
Bump version
2023-04-03 20:13:46 -04:00
Andrei Betlen
7fedf16531
Add support for chat completion
2023-04-03 20:12:44 -04:00
Andrei Betlen
3dec778c90
Update to more sensible return signature
2023-04-03 20:12:14 -04:00
Andrei Betlen
f7ab8d55b2
Update context size defaults. Closes #11
2023-04-03 20:11:13 -04:00