Andrei
c2e690b326
Merge pull request #29 from MillionthOdin16/main
Fixes and Tweaks to Defaults
2023-04-06 21:06:31 -04:00
Mug
10c7571117
Fixed too many newlines; now onto args.
Still needs packaging work so that you could do "python -m llama_cpp.examples." etc.
2023-04-06 15:33:22 +02:00
Mug
085cc92b1f
Better llama.cpp interoperability
Still has some excess-newline issues, so WIP
2023-04-06 15:30:57 +02:00
MillionthOdin16
2e91affea2
Ignore .idea folder
2023-04-05 18:23:17 -04:00
MillionthOdin16
c283edd7f2
Set n_batch to default values and reduce thread count:
Change batch size to the llama.cpp default of 8. I've seen issues in llama.cpp where batch size affects the quality of generations (it shouldn't, but in case that's still an issue, I changed it to the default).
Set the auto-determined number of threads to half the system count. ggml will sometimes lock cores at 100% while doing nothing; this is being addressed, but it can make for a bad user experience if cores are pegged at 100%.
2023-04-05 18:17:29 -04:00
MillionthOdin16
b9b6dfd23f
Merge remote-tracking branch 'origin/main'
2023-04-05 17:51:43 -04:00
MillionthOdin16
76a82babef
Set n_batch to the default value of 8. I think this is leftover from when n_ctx was missing and n_batch was 2048.
2023-04-05 17:44:53 -04:00
Andrei Betlen
38f7dea6ca
Update README and docs
2023-04-05 17:44:25 -04:00
MillionthOdin16
1e90597983
Add pydantic dep. Errors occur if pydantic isn't present. Also throws errors relating to TypedDict or subclass() if the version is too old or too new.
2023-04-05 17:37:06 -04:00
Andrei Betlen
267d3648fc
Bump version
2023-04-05 16:26:22 -04:00
Andrei Betlen
74bf043ddd
Update llama.cpp
2023-04-05 16:25:54 -04:00
Andrei Betlen
44448fb3a8
Add server as a subpackage
2023-04-05 16:23:25 -04:00
Andrei Betlen
e1b5b9bb04
Update fastapi server example
2023-04-05 14:44:26 -04:00
Mug
283e59c5e9
Fix bug where init_break was not being set when exiting via antiprompt, among other fixes.
2023-04-05 14:47:24 +02:00
Mug
99ceecfccd
Move to new examples directory
2023-04-05 14:28:02 +02:00
Mug
e3ea354547
Allow local llama library usage
2023-04-05 14:23:01 +02:00
Mug
e4c6f34d95
Merge branch 'main' of https://github.com/abetlen/llama-cpp-python
2023-04-05 14:18:27 +02:00
Andrei Betlen
6de2f24aca
Bump version
2023-04-05 06:53:43 -04:00
Andrei Betlen
e96a5c5722
Make Llama instance pickleable. Closes #27
2023-04-05 06:52:17 -04:00
Andrei Betlen
152e4695c3
Bump Version
2023-04-05 04:43:51 -04:00
Andrei Betlen
c177c807e5
Add supported python versions
2023-04-05 04:43:19 -04:00
Andrei Betlen
17fdd1547c
Update workflow name and add badge to README
2023-04-05 04:41:24 -04:00
Andrei Betlen
7643f6677d
Bugfix for Python 3.7
2023-04-05 04:37:33 -04:00
Andrei Betlen
4d015c33bd
Fix syntax error
2023-04-05 04:35:15 -04:00
Andrei Betlen
47570df17b
Checkout submodules
2023-04-05 04:34:19 -04:00
Andrei Betlen
e3f999e732
Add missing scikit-build install
2023-04-05 04:31:38 -04:00
Andrei Betlen
43c20d3282
Add initial github action to run automated tests
2023-04-05 04:30:32 -04:00
Andrei Betlen
b1babcf56c
Add quantize example
2023-04-05 04:17:26 -04:00
Andrei Betlen
c8e13a78d0
Re-organize examples folder
2023-04-05 04:10:13 -04:00
Andrei Betlen
c16bda5fb9
Add performance tuning notebook
2023-04-05 04:09:19 -04:00
Andrei Betlen
cefc69ea43
Add runtime check to ensure embedding is enabled if trying to generate embeddings
2023-04-05 03:25:37 -04:00
Andrei Betlen
5c50af7462
Remove workaround
2023-04-05 03:25:09 -04:00
Andrei Betlen
c3972b61ae
Add basic tests. Closes #24
2023-04-05 03:23:15 -04:00
Andrei Betlen
51dbcf2693
Bugfix: wrong signature for quantize function
2023-04-04 22:36:59 -04:00
Andrei Betlen
8279fb7d92
Bump version
2023-04-04 17:17:11 -04:00
Andrei Betlen
c137789143
Add verbose flag. Closes #19
2023-04-04 13:09:24 -04:00
Andrei Betlen
5075c16fcc
Bugfix: n_batch should always be <= n_ctx
2023-04-04 13:08:21 -04:00
Mug
c862e8bac5
Fix repeating instructions and an antiprompt bug
2023-04-04 17:54:47 +02:00
Andrei Betlen
248b0566fa
Update README
2023-04-04 10:57:22 -04:00
Mug
9cde7973cc
Fix stripping instruction prompt
2023-04-04 16:20:27 +02:00
Mug
da5a6a7089
Added instruction mode, fixed infinite generation, and various other fixes
2023-04-04 16:18:26 +02:00
Mug
0b32bb3d43
Add instruction mode
2023-04-04 11:48:48 +02:00
Andrei Betlen
ffe34cf64d
Allow user to set llama config from env vars
2023-04-04 00:52:44 -04:00
Andrei Betlen
05eb2087d8
Small fixes for examples
2023-04-03 20:33:07 -04:00
Andrei Betlen
caf3c0362b
Add return type for default __call__ method
2023-04-03 20:26:08 -04:00
Andrei Betlen
4aa349d777
Add docstring for create_chat_completion
2023-04-03 20:24:20 -04:00
Andrei Betlen
4615f1e520
Add chat completion method to docs
2023-04-03 20:14:03 -04:00
Andrei Betlen
5cf29d0231
Bump version
2023-04-03 20:13:46 -04:00
Andrei Betlen
7fedf16531
Add support for chat completion
2023-04-03 20:12:44 -04:00
Andrei Betlen
3dec778c90
Update to more sensible return signature
2023-04-03 20:12:14 -04:00