Commit graph

287 commits

Author SHA1 Message Date
jm12138
90e1021154 Add unlimited max_tokens 2023-04-10 15:56:05 +00:00
Andrei Betlen
1f67ad2a0b Add use_mmap option 2023-04-10 02:11:35 -04:00
Andrei Betlen
314ce7d1cc Fix cpu count default 2023-04-08 19:54:04 -04:00
Andrei Betlen
3fbc06361f Formatting 2023-04-08 16:01:45 -04:00
Andrei Betlen
e96a5c5722 Make Llama instance pickleable. Closes #27 2023-04-05 06:52:17 -04:00
Andrei Betlen
cefc69ea43 Add runtime check to ensure embedding is enabled if trying to generate embeddings 2023-04-05 03:25:37 -04:00
Andrei Betlen
5c50af7462 Remove workaround 2023-04-05 03:25:09 -04:00
Andrei Betlen
c137789143 Add verbose flag. Closes #19 2023-04-04 13:09:24 -04:00
Andrei Betlen
5075c16fcc Bugfix: n_batch should always be <= n_ctx 2023-04-04 13:08:21 -04:00
Andrei Betlen
caf3c0362b Add return type for default __call__ method 2023-04-03 20:26:08 -04:00
Andrei Betlen
4aa349d777 Add docstring for create_chat_completion 2023-04-03 20:24:20 -04:00
Andrei Betlen
7fedf16531 Add support for chat completion 2023-04-03 20:12:44 -04:00
Andrei Betlen
3dec778c90 Update to more sensible return signature 2023-04-03 20:12:14 -04:00
Andrei Betlen
ae004eb69e Fix #16 2023-04-03 18:46:19 -04:00
Andrei Betlen
4f509b963e Bugfix: Stop sequences and missing max_tokens check 2023-04-02 03:59:19 -04:00
Andrei Betlen
353e18a781 Move workaround to new sample method 2023-04-02 00:06:34 -04:00
Andrei Betlen
a4a1bbeaa9 Update api to allow for easier interactive mode 2023-04-02 00:02:47 -04:00
Andrei Betlen
eef627c09c Fix example documentation 2023-04-01 17:39:35 -04:00
Andrei Betlen
1e4346307c Add documentation for generate method 2023-04-01 17:36:30 -04:00
Andrei Betlen
67c70cc8eb Add static methods for beginning and end of sequence tokens. 2023-04-01 17:29:30 -04:00
Andrei Betlen
318eae237e Update high-level api 2023-04-01 13:01:27 -04:00
Andrei Betlen
70b8a1ef75 Add support to get embeddings from high-level api. Closes #4 2023-03-28 04:59:54 -04:00
Andrei Betlen
3dbb3fd3f6 Add support for stream parameter. Closes #1 2023-03-28 04:03:57 -04:00
Andrei Betlen
30fc0f3866 Extract generate method 2023-03-28 02:42:22 -04:00
Andrei Betlen
1c823f6d0f Refactor Llama class and add tokenize / detokenize methods Closes #3 2023-03-28 01:45:37 -04:00
Andrei Betlen
8ae3beda9c Update Llama to add params 2023-03-25 16:26:23 -04:00
Andrei Betlen
b121b7c05b Update docstring 2023-03-25 12:33:18 -04:00
Andrei Betlen
df15caa877 Add mkdocs 2023-03-24 18:57:59 -04:00
Andrei Betlen
b93675608a Handle errors returned by llama.cpp 2023-03-24 15:47:17 -04:00
Andrei Betlen
7786edb0f9 Black formatting 2023-03-24 14:59:29 -04:00
Andrei Betlen
b9c53b88a1 Use n_ctx provided from actual context not params 2023-03-24 14:58:10 -04:00
Andrei Betlen
2cc499512c Black formatting 2023-03-24 14:35:41 -04:00
Andrei Betlen
e24c581b5a Implement prompt batch processing as in main.cpp 2023-03-24 14:33:38 -04:00
Andrei Betlen
a28cb92d8f Remove model_name param 2023-03-24 04:04:29 -04:00
Andrei Betlen
eec9256a42 Bugfix: avoid decoding partial utf-8 characters 2023-03-23 16:25:13 -04:00
Andrei Betlen
e63ea4dbbc Add support for logprobs 2023-03-23 15:51:05 -04:00
Andrei Betlen
79b304c9d4 Initial commit 2023-03-23 05:33:06 -04:00