baalajimaestro/llama.cpp

Author	SHA1	Message	Date
Andrei Betlen	baa825dacb	Add windows and mac runners	2023-04-06 21:27:01 -04:00
Andrei Betlen	da539cc2ee	Safer calculation of default n_threads	2023-04-06 21:22:19 -04:00
Andrei Betlen	9b7526895d	Bump version	2023-04-06 21:19:08 -04:00
Andrei Betlen	7851cc1e3c	Don't install pydantic by default	2023-04-06 21:10:34 -04:00
Andrei Betlen	09707f5b2a	Remove console script	2023-04-06 21:08:32 -04:00
Andrei Betlen	930db37dd2	Merge branch 'main' of github.com:abetlen/llama_cpp_python into main	2023-04-06 21:07:38 -04:00
Andrei Betlen	55279b679d	Handle prompt list	2023-04-06 21:07:35 -04:00
Andrei	c2e690b326	Merge pull request #29 from MillionthOdin16/main Fixes and Tweaks to Defaults	2023-04-06 21:06:31 -04:00
MillionthOdin16	2e91affea2	Ignore ./idea folder	2023-04-05 18:23:17 -04:00
MillionthOdin16	c283edd7f2	Set n_batch to default values and reduce thread count: Change batch size to the llama.cpp default of 8. I've seen issues in llama.cpp where batch size affects quality of generations. (It shouldn't) But in case that's still an issue I changed to default. Set auto-determined num of threads to 1/2 system count. ggml will sometimes lock cores at 100% while doing nothing. This is being addressed, but can cause bad experience for user if pegged at 100%	2023-04-05 18:17:29 -04:00
MillionthOdin16	b9b6dfd23f	Merge remote-tracking branch 'origin/main'	2023-04-05 17:51:43 -04:00
MillionthOdin16	76a82babef	Set n_batch to the default value of 8. I think this is leftover from when n_ctx was missing and n_batch was 2048.	2023-04-05 17:44:53 -04:00
Andrei Betlen	38f7dea6ca	Update README and docs	2023-04-05 17:44:25 -04:00
MillionthOdin16	1e90597983	Add pydantic dep. Errors if pydantic isn't present. Also throws errors relating to TypedDict or subclass() if the version is too old or new...	2023-04-05 17:37:06 -04:00
Andrei Betlen	267d3648fc	Bump version	2023-04-05 16:26:22 -04:00
Andrei Betlen	74bf043ddd	Update llama.cpp	2023-04-05 16:25:54 -04:00
Andrei Betlen	44448fb3a8	Add server as a subpackage	2023-04-05 16:23:25 -04:00
Andrei Betlen	e1b5b9bb04	Update fastapi server example	2023-04-05 14:44:26 -04:00
Andrei Betlen	6de2f24aca	Bump version	2023-04-05 06:53:43 -04:00
Andrei Betlen	e96a5c5722	Make Llama instance pickleable. Closes #27	2023-04-05 06:52:17 -04:00
Andrei Betlen	152e4695c3	Bump Version	2023-04-05 04:43:51 -04:00
Andrei Betlen	c177c807e5	Add supported python versions	2023-04-05 04:43:19 -04:00
Andrei Betlen	17fdd1547c	Update workflow name and add badge to README	2023-04-05 04:41:24 -04:00
Andrei Betlen	7643f6677d	Bugfix for Python3.7	2023-04-05 04:37:33 -04:00
Andrei Betlen	4d015c33bd	Fix syntax error	2023-04-05 04:35:15 -04:00
Andrei Betlen	47570df17b	Checkout submodules	2023-04-05 04:34:19 -04:00
Andrei Betlen	e3f999e732	Add missing scikit-build install	2023-04-05 04:31:38 -04:00
Andrei Betlen	43c20d3282	Add initial github action to run automated tests	2023-04-05 04:30:32 -04:00
Andrei Betlen	b1babcf56c	Add quantize example	2023-04-05 04:17:26 -04:00
Andrei Betlen	c8e13a78d0	Re-organize examples folder	2023-04-05 04:10:13 -04:00
Andrei Betlen	c16bda5fb9	Add performance tuning notebook	2023-04-05 04:09:19 -04:00
Andrei Betlen	cefc69ea43	Add runtime check to ensure embedding is enabled if trying to generate embeddings	2023-04-05 03:25:37 -04:00
Andrei Betlen	5c50af7462	Remove workaround	2023-04-05 03:25:09 -04:00
Andrei Betlen	c3972b61ae	Add basic tests. Closes #24	2023-04-05 03:23:15 -04:00
Andrei Betlen	51dbcf2693	Bugfix: wrong signature for quantize function	2023-04-04 22:36:59 -04:00
Andrei Betlen	8279fb7d92	Bump version	2023-04-04 17:17:11 -04:00
Andrei Betlen	c137789143	Add verbose flag. Closes #19	2023-04-04 13:09:24 -04:00
Andrei Betlen	5075c16fcc	Bugfix: n_batch should always be <= n_ctx	2023-04-04 13:08:21 -04:00
Andrei Betlen	248b0566fa	Update README	2023-04-04 10:57:22 -04:00
Andrei Betlen	ffe34cf64d	Allow user to set llama config from env vars	2023-04-04 00:52:44 -04:00
Andrei Betlen	05eb2087d8	Small fixes for examples	2023-04-03 20:33:07 -04:00
Andrei Betlen	caf3c0362b	Add return type for default __call__ method	2023-04-03 20:26:08 -04:00
Andrei Betlen	4aa349d777	Add docstring for create_chat_completion	2023-04-03 20:24:20 -04:00
Andrei Betlen	4615f1e520	Add chat completion method to docs	2023-04-03 20:14:03 -04:00
Andrei Betlen	5cf29d0231	Bump version	2023-04-03 20:13:46 -04:00
Andrei Betlen	7fedf16531	Add support for chat completion	2023-04-03 20:12:44 -04:00
Andrei Betlen	3dec778c90	Update to more sensible return signature	2023-04-03 20:12:14 -04:00
Andrei Betlen	f7ab8d55b2	Update context size defaults Close #11	2023-04-03 20:11:13 -04:00
Andrei Betlen	c0a5c0171f	Add embed back into documentation	2023-04-03 18:53:00 -04:00
Andrei Betlen	adf656d542	Bump version	2023-04-03 18:46:49 -04:00


145 commits
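
Several of the commits above adjust defaults: da539cc2ee and c283edd7f2 derive the thread count from the system's CPU count (c283edd7f2 uses half of it), and 5075c16fcc requires n_batch <= n_ctx. The log does not show the code itself, so the following is only a minimal sketch of what such logic might look like. The function names default_n_threads and clamp_n_batch are hypothetical and are not taken from the repository.

```python
import os
from typing import Optional


def default_n_threads(cpu_count: Optional[int] = None) -> int:
    """Hypothetical helper: half the logical CPU count, never below 1."""
    if cpu_count is None:
        # os.cpu_count() can return None when the count is undetermined.
        cpu_count = os.cpu_count()
    # Assumption: treat an unknown (None) or zero count as 2 CPUs.
    return max((cpu_count or 2) // 2, 1)


def clamp_n_batch(n_batch: int, n_ctx: int) -> int:
    """Hypothetical helper: keep the batch size within the context size."""
    return min(n_batch, n_ctx)
```

The floor of 1 is there because a single-core machine, or a platform where the CPU count cannot be determined, would otherwise end up with zero threads.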