baalajimaestro/llama.cpp

Author	SHA1	Message	Date
Andrei Betlen	1f67ad2a0b	Add use_mmap option	2023-04-10 02:11:35 -04:00
Andrei Betlen	314ce7d1cc	Fix cpu count default	2023-04-08 19:54:04 -04:00
Andrei Betlen	3fbc06361f	Formatting	2023-04-08 16:01:45 -04:00
Andrei Betlen	e96a5c5722	Make Llama instance pickleable. Closes #27	2023-04-05 06:52:17 -04:00
Andrei Betlen	cefc69ea43	Add runtime check to ensure embedding is enabled if trying to generate embeddings	2023-04-05 03:25:37 -04:00
Andrei Betlen	5c50af7462	Remove workaround	2023-04-05 03:25:09 -04:00
Andrei Betlen	c137789143	Add verbose flag. Closes #19	2023-04-04 13:09:24 -04:00
Andrei Betlen	5075c16fcc	Bugfix: n_batch should always be <= n_ctx	2023-04-04 13:08:21 -04:00
Andrei Betlen	caf3c0362b	Add return type for default __call__ method	2023-04-03 20:26:08 -04:00
Andrei Betlen	4aa349d777	Add docstring for create_chat_completion	2023-04-03 20:24:20 -04:00
Andrei Betlen	7fedf16531	Add support for chat completion	2023-04-03 20:12:44 -04:00
Andrei Betlen	3dec778c90	Update to more sensible return signature	2023-04-03 20:12:14 -04:00
Andrei Betlen	ae004eb69e	Fix #16	2023-04-03 18:46:19 -04:00
Andrei Betlen	4f509b963e	Bugfix: Stop sequences and missing max_tokens check	2023-04-02 03:59:19 -04:00
Andrei Betlen	353e18a781	Move workaround to new sample method	2023-04-02 00:06:34 -04:00
Andrei Betlen	a4a1bbeaa9	Update api to allow for easier interactive mode	2023-04-02 00:02:47 -04:00
Andrei Betlen	eef627c09c	Fix example documentation	2023-04-01 17:39:35 -04:00
Andrei Betlen	1e4346307c	Add documentation for generate method	2023-04-01 17:36:30 -04:00
Andrei Betlen	67c70cc8eb	Add static methods for beginning and end of sequence tokens.	2023-04-01 17:29:30 -04:00
Andrei Betlen	318eae237e	Update high-level api	2023-04-01 13:01:27 -04:00
Andrei Betlen	70b8a1ef75	Add support to get embeddings from high-level api. Closes #4	2023-03-28 04:59:54 -04:00
Andrei Betlen	3dbb3fd3f6	Add support for stream parameter. Closes #1	2023-03-28 04:03:57 -04:00
Andrei Betlen	30fc0f3866	Extract generate method	2023-03-28 02:42:22 -04:00
Andrei Betlen	1c823f6d0f	Refactor Llama class and add tokenize / detokenize methods Closes #3	2023-03-28 01:45:37 -04:00
Andrei Betlen	8ae3beda9c	Update Llama to add params	2023-03-25 16:26:23 -04:00
Andrei Betlen	b121b7c05b	Update docstring	2023-03-25 12:33:18 -04:00
Andrei Betlen	df15caa877	Add mkdocs	2023-03-24 18:57:59 -04:00
Andrei Betlen	b93675608a	Handle errors returned by llama.cpp	2023-03-24 15:47:17 -04:00
Andrei Betlen	7786edb0f9	Black formatting	2023-03-24 14:59:29 -04:00
Andrei Betlen	b9c53b88a1	Use n_ctx provided from actual context not params	2023-03-24 14:58:10 -04:00
Andrei Betlen	2cc499512c	Black formatting	2023-03-24 14:35:41 -04:00
Andrei Betlen	e24c581b5a	Implement prompt batch processing as in main.cpp	2023-03-24 14:33:38 -04:00
Andrei Betlen	a28cb92d8f	Remove model_name param	2023-03-24 04:04:29 -04:00
Andrei Betlen	eec9256a42	Bugfix: avoid decoding partial utf-8 characters	2023-03-23 16:25:13 -04:00
Andrei Betlen	e63ea4dbbc	Add support for logprobs	2023-03-23 15:51:05 -04:00
Andrei Betlen	79b304c9d4	Initial commit	2023-03-23 05:33:06 -04:00

36 commits