Commit graph

702 commits

Author SHA1 Message Date
Andrei Betlen
a7281994d8 chore: Bump version 2024-03-08 21:14:44 -05:00
Andrei Betlen
919fca9f2b Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main 2024-03-08 21:10:56 -05:00
Andrei Betlen
d02a9cf16f Fixed json strings grammar by blacklisting character control set. Closes #1259 2024-03-08 21:10:53 -05:00
Felipe Lorenz
c139f8b5d5
feat: Add endpoints for tokenize, detokenize and count tokens (#1136)
* Add endpoint to count tokens

* Add tokenize and detokenize endpoints

* Change response key to tokens for tokenize endpoint

* Fix dependency bug

* Cleanup

* Remove example added by mistake

* Move tokenize, detokenize, and count to Extras namespace. Tag existing endpoints

---------

Co-authored-by: Andrei Betlen <abetlen@gmail.com>
2024-03-08 21:09:00 -05:00
Kevin Cao
1f3156d4f2
fix: Check for existence of clip model path (#1264) 2024-03-08 21:00:10 -05:00
Douglas Hanley
2811014bae
feat: Switch embed to llama_get_embeddings_seq (#1263)
* switch to llama_get_embeddings_seq

* Remove duplicate definition of llama_get_embeddings_seq

Co-authored-by: Andrei <abetlen@gmail.com>

---------

Co-authored-by: Andrei <abetlen@gmail.com>
2024-03-08 20:59:35 -05:00
Andrei Betlen
40c6b54f68 feat: Update llama.cpp 2024-03-08 20:58:50 -05:00
Andrei Betlen
93dc56ace8 Update llama.cpp 2024-03-06 01:32:00 -05:00
Andrei Betlen
87a6e5797e feat: Update llama.cpp 2024-03-03 11:27:04 -05:00
Andrei Betlen
13177aae0f chore: Bump version 2024-03-02 22:46:40 -05:00
Andrei Betlen
0e70984fb6 feat: Update llama.cpp 2024-03-02 22:20:04 -05:00
Andrei Betlen
d5df431278 chore: Bump version 2024-03-01 13:15:16 -05:00
Andrei Betlen
97aa3a153d docs: Add information re: auto chat formats. Closes #1236 2024-03-01 13:10:25 -05:00
Andrei Betlen
f062a7f51d feat: Update llama.cpp 2024-03-01 12:57:16 -05:00
Andrei Betlen
8c71725d53 fix: Remove deprecated cfg sampling functions 2024-02-28 14:37:07 -05:00
Andrei Betlen
727d60c28a misc: Format 2024-02-28 14:27:40 -05:00
Andrei Betlen
0d37ce52b1 feat: Update llama.cpp 2024-02-28 14:27:16 -05:00
Andrei Betlen
ffcd4b2636 chore: Bump version 2024-02-28 01:38:32 -05:00
Sigbjørn Skjæret
c36ab15e68
fix: eos/bos_token set correctly for Jinja2ChatFormatter and automatic chat formatter (#1230)
The token strings were not correctly retrieved (empty).
2024-02-28 01:30:31 -05:00
Andrei Betlen
fea33c9b94 feat: Update llama.cpp 2024-02-27 12:22:17 -05:00
Andrei
4d574bd765
feat(server): Add support for pulling models from Huggingface Hub (#1222)
* Basic support for hf pull on server

* Add hf_model_repo_id setting

* Update README
2024-02-26 14:35:08 -05:00
Andrei Betlen
afe1e445c9 chore: Bump version 2024-02-26 11:43:24 -05:00
Andrei Betlen
9558ce7878 feat: Update llama.cpp 2024-02-26 11:40:58 -05:00
Andrei Betlen
dbaba3059d fix: positional arguments only for low-level api 2024-02-26 11:31:11 -05:00
Andrei Betlen
78e536dcfe fix: typo 2024-02-26 11:14:26 -05:00
Andrei Betlen
44558cbd7a misc: llava_cpp use ctypes function decorator for binding 2024-02-26 11:07:33 -05:00
Andrei Betlen
8383a9e562 fix: llava this function takes at least 4 arguments (0 given) 2024-02-26 11:03:20 -05:00
Andrei Betlen
8e03fd9957 chore: Bump version 2024-02-25 21:15:42 -05:00
Andrei Betlen
dcf38f6141 fix: remove prematurely commited change 2024-02-25 21:00:37 -05:00
Andrei Betlen
cbbcd888af feat: Update llama.cpp 2024-02-25 20:52:14 -05:00
Andrei Betlen
19234aa0db fix: Restore type hints for low-level api 2024-02-25 16:54:37 -05:00
Andrei Betlen
2292af5796 feat: Update llama.cpp 2024-02-25 16:53:58 -05:00
Andrei Betlen
221edb9ef1 feat: Update llama.cpp 2024-02-24 23:47:29 -05:00
Andrei Betlen
20ea6fd7d6 chore: Bump version 2024-02-23 12:38:36 -05:00
Andrei Betlen
47bad30dd7 fix: LlamaHFTokenizer now receives pre_tokens 2024-02-23 12:23:24 -05:00
Andrei Betlen
ded5d627a5 chore: Bump version 2024-02-23 11:32:43 -05:00
Luke Stanley
858496224e
feat: Auto detect Mixtral's slightly different format (#1214) 2024-02-23 11:27:38 -05:00
Andrei Betlen
db776a885c fix: module 'llama_cpp.llama_cpp' has no attribute 'c_uint8' 2024-02-23 11:24:53 -05:00
Andrei Betlen
427d816ebf chore: Bump version 2024-02-23 04:54:08 -05:00
Alvaro Bartolome
251a8a2cad
feat: Add Google's Gemma formatting via chat_format="gemma" (#1210)
* Add Google's Gemma formatting via `chat_format="gemma"`

* Replace `raise ValueError` with `logger.debug`

Co-authored-by: Andrei <abetlen@gmail.com>

---------

Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-23 04:40:52 -05:00
Andrei Betlen
b9aca612af misc: use typesafe byref for internal classes 2024-02-23 03:40:07 -05:00
Andrei Betlen
a0ce429dc0 misc: use decorator to bind low level api functions, fixes docs 2024-02-23 03:39:38 -05:00
Andrei Betlen
e10af30cf1 fix: TypeAlias import error 2024-02-22 03:27:28 -05:00
Andrei Betlen
3561ebf536 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main 2024-02-22 03:25:13 -05:00
Andrei Betlen
aefcb8f71a misc: additional type annotations for low level api 2024-02-22 02:00:09 -05:00
Andrei Betlen
3921e10770 feat: support minItems/maxItems in JSON grammar converter (by @nopperl) 2024-02-22 00:17:06 -05:00
Andrei Betlen
e6d6260a91 fix: Update from_pretrained defaults to match hf_hub_download 2024-02-22 00:10:23 -05:00
Andrei Betlen
dd22010e85 fix: Raise exceptions when llama model or context fails to load 2024-02-22 00:09:45 -05:00
Andrei Betlen
3632241e98 chore: Bump version 2024-02-21 23:09:13 -05:00
Andrei Betlen
0653e15c20 feat: Update llama.cpp 2024-02-21 23:04:52 -05:00