Commit graph

147 commits

Author SHA1 Message Date
Andrei Betlen
d75196d7a1 Install with pip during build step
    Use setup.py install
    Upgrade version of setuptools
    Revert to develop
    Use setup.py build and pip install
    Just use pip install
    Use correct name in pyproject.toml
    Make pip install verbose
2023-04-06 22:21:45 -04:00
Andrei Betlen
dd1c298620 Fix typo 2023-04-06 21:28:03 -04:00
Andrei Betlen
baa825dacb Add windows and mac runners 2023-04-06 21:27:01 -04:00
Andrei Betlen
da539cc2ee Safer calculation of default n_threads 2023-04-06 21:22:19 -04:00
Andrei Betlen
9b7526895d Bump version 2023-04-06 21:19:08 -04:00
Andrei Betlen
7851cc1e3c Don't install pydantic by default 2023-04-06 21:10:34 -04:00
Andrei Betlen
09707f5b2a Remove console script 2023-04-06 21:08:32 -04:00
Andrei Betlen
930db37dd2 Merge branch 'main' of github.com:abetlen/llama_cpp_python into main 2023-04-06 21:07:38 -04:00
Andrei Betlen
55279b679d Handle prompt list 2023-04-06 21:07:35 -04:00
Andrei
c2e690b326 Merge pull request #29 from MillionthOdin16/main
    Fixes and Tweaks to Defaults
2023-04-06 21:06:31 -04:00
MillionthOdin16
2e91affea2 Ignore ./idea folder 2023-04-05 18:23:17 -04:00
MillionthOdin16
c283edd7f2 Set n_batch to the default value and reduce thread count:
    Change the batch size to the llama.cpp default of 8. I've seen issues in llama.cpp where batch size affects the quality of generations (it shouldn't), but in case that's still an issue I changed it to the default.
    Set the auto-determined number of threads to half the system core count. ggml will sometimes lock cores at 100% while doing nothing; this is being addressed, but it can make for a bad user experience if cores stay pegged at 100%.
2023-04-05 18:17:29 -04:00
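A minimal sketch of the half-of-system-cores default described above, assuming the core count comes from Python's standard multiprocessing module (the helper name is illustrative, not the repository's actual code):

    import multiprocessing

    def default_n_threads() -> int:
        # Use half of the reported cores, but never fewer than one, so that
        # ggml's busy-waiting worker threads do not peg every core at 100%.
        return max(multiprocessing.cpu_count() // 2, 1)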
MillionthOdin16
b9b6dfd23f Merge remote-tracking branch 'origin/main' 2023-04-05 17:51:43 -04:00
MillionthOdin16
76a82babef Set n_batch to the default value of 8. I think this is leftover from when n_ctx was missing and n_batch was 2048. 2023-04-05 17:44:53 -04:00
Andrei Betlen
38f7dea6ca Update README and docs 2023-04-05 17:44:25 -04:00
MillionthOdin16
1e90597983 Add pydantic dep. Errors if pydantic isn't present. Also throws errors relating to TypedDict or subclass() if the version is too old or too new... 2023-04-05 17:37:06 -04:00
Andrei Betlen
267d3648fc Bump version 2023-04-05 16:26:22 -04:00
Andrei Betlen
74bf043ddd Update llama.cpp 2023-04-05 16:25:54 -04:00
Andrei Betlen
44448fb3a8 Add server as a subpackage 2023-04-05 16:23:25 -04:00
Andrei Betlen
e1b5b9bb04 Update fastapi server example 2023-04-05 14:44:26 -04:00
Andrei Betlen
6de2f24aca Bump version 2023-04-05 06:53:43 -04:00
Andrei Betlen
e96a5c5722 Make Llama instance pickleable. Closes #27 2023-04-05 06:52:17 -04:00
Andrei Betlen
152e4695c3 Bump Version 2023-04-05 04:43:51 -04:00
Andrei Betlen
c177c807e5 Add supported python versions 2023-04-05 04:43:19 -04:00
Andrei Betlen
17fdd1547c Update workflow name and add badge to README 2023-04-05 04:41:24 -04:00
Andrei Betlen
7643f6677d Bugfix for Python3.7 2023-04-05 04:37:33 -04:00
Andrei Betlen
4d015c33bd Fix syntax error 2023-04-05 04:35:15 -04:00
Andrei Betlen
47570df17b Checkout submodules 2023-04-05 04:34:19 -04:00
Andrei Betlen
e3f999e732 Add missing scikit-build install 2023-04-05 04:31:38 -04:00
Andrei Betlen
43c20d3282 Add initial github action to run automated tests 2023-04-05 04:30:32 -04:00
Andrei Betlen
b1babcf56c Add quantize example 2023-04-05 04:17:26 -04:00
Andrei Betlen
c8e13a78d0 Re-organize examples folder 2023-04-05 04:10:13 -04:00
Andrei Betlen
c16bda5fb9 Add performance tuning notebook 2023-04-05 04:09:19 -04:00
Andrei Betlen
cefc69ea43 Add runtime check to ensure embedding is enabled if trying to generate embeddings 2023-04-05 03:25:37 -04:00
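A hedged sketch of what such a runtime check might look like; the class name and the embedding flag here are assumptions for illustration, not the repository's actual code:

    class LlamaSketch:
        def __init__(self, embedding: bool = False):
            # Whether the model was loaded with embedding support enabled.
            self.embedding = embedding

        def create_embedding(self, text: str):
            # Fail fast with a clear message instead of calling into the backend
            # with embeddings disabled.
            if not self.embedding:
                raise RuntimeError("Model must be created with embedding=True to generate embeddings")
            return []  # placeholder for the real call into llama.cpp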
Andrei Betlen
5c50af7462 Remove workaround 2023-04-05 03:25:09 -04:00
Andrei Betlen
c3972b61ae Add basic tests. Closes #24 2023-04-05 03:23:15 -04:00
Andrei Betlen
51dbcf2693 Bugfix: wrong signature for quantize function 2023-04-04 22:36:59 -04:00
Andrei Betlen
8279fb7d92 Bump version 2023-04-04 17:17:11 -04:00
Andrei Betlen
c137789143 Add verbose flag. Closes #19 2023-04-04 13:09:24 -04:00
Andrei Betlen
5075c16fcc Bugfix: n_batch should always be <= n_ctx 2023-04-04 13:08:21 -04:00
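The invariant from this fix can be expressed as a one-line clamp; a sketch assuming both values are plain integers passed to the constructor (names follow the commit message):

    def clamp_n_batch(n_batch: int, n_ctx: int) -> int:
        # A batch larger than the context window is never useful, so cap it.
        return min(n_batch, n_ctx)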
Andrei Betlen
248b0566fa Update README 2023-04-04 10:57:22 -04:00
Andrei Betlen
ffe34cf64d Allow user to set llama config from env vars 2023-04-04 00:52:44 -04:00
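A minimal sketch of reading llama settings from the environment; the variable names (MODEL, N_CTX, N_BATCH) and defaults are illustrative assumptions, not the project's documented names:

    import os

    # Fall back to defaults when the environment variables are not set.
    model_path = os.environ.get("MODEL", "./models/ggml-model.bin")
    n_ctx = int(os.environ.get("N_CTX", "512"))
    n_batch = int(os.environ.get("N_BATCH", "8"))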
Andrei Betlen
05eb2087d8 Small fixes for examples 2023-04-03 20:33:07 -04:00
Andrei Betlen
caf3c0362b Add return type for default __call__ method 2023-04-03 20:26:08 -04:00
Andrei Betlen
4aa349d777 Add docstring for create_chat_completion 2023-04-03 20:24:20 -04:00
Andrei Betlen
4615f1e520 Add chat completion method to docs 2023-04-03 20:14:03 -04:00
Andrei Betlen
5cf29d0231 Bump version 2023-04-03 20:13:46 -04:00
Andrei Betlen
7fedf16531 Add support for chat completion 2023-04-03 20:12:44 -04:00
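A short usage sketch of the chat completion method added here; the model path is a placeholder, and the message schema follows the OpenAI-style role/content format:

    from llama_cpp import Llama

    llm = Llama(model_path="./models/ggml-model.bin")  # placeholder path
    response = llm.create_chat_completion(
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Name three planets."},
        ],
    )
    print(response["choices"][0]["message"]["content"])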
Andrei Betlen
3dec778c90 Update to more sensible return signature 2023-04-03 20:12:14 -04:00
Andrei Betlen
f7ab8d55b2 Update context size defaults. Closes #11 2023-04-03 20:11:13 -04:00