Andrei Betlen
727d60c28a
misc: Format
2024-02-28 14:27:40 -05:00
Andrei Betlen
0d37ce52b1
feat: Update llama.cpp
2024-02-28 14:27:16 -05:00
Andrei Betlen
ffcd4b2636
chore: Bump version
2024-02-28 01:38:32 -05:00
Sigbjørn Skjæret
c36ab15e68
fix: eos/bos_token set correctly for Jinja2ChatFormatter and automatic chat formatter (#1230)
The token strings were not correctly retrieved (empty).
2024-02-28 01:30:31 -05:00
Andrei Betlen
fea33c9b94
feat: Update llama.cpp
2024-02-27 12:22:17 -05:00
Andrei
4d574bd765
feat(server): Add support for pulling models from Huggingface Hub (#1222)
* Basic support for hf pull on server
* Add hf_model_repo_id setting
* Update README
2024-02-26 14:35:08 -05:00
Andrei Betlen
afe1e445c9
chore: Bump version
2024-02-26 11:43:24 -05:00
Andrei Betlen
9558ce7878
feat: Update llama.cpp
2024-02-26 11:40:58 -05:00
Andrei Betlen
dbaba3059d
fix: positional arguments only for low-level api
2024-02-26 11:31:11 -05:00
Andrei Betlen
78e536dcfe
fix: typo
2024-02-26 11:14:26 -05:00
Andrei Betlen
44558cbd7a
misc: llava_cpp use ctypes function decorator for binding
2024-02-26 11:07:33 -05:00
Andrei Betlen
8383a9e562
fix: llava this function takes at least 4 arguments (0 given)
2024-02-26 11:03:20 -05:00
Andrei Betlen
8e03fd9957
chore: Bump version
2024-02-25 21:15:42 -05:00
Andrei Betlen
dcf38f6141
fix: remove prematurely committed change
2024-02-25 21:00:37 -05:00
Andrei Betlen
cbbcd888af
feat: Update llama.cpp
2024-02-25 20:52:14 -05:00
Andrei Betlen
19234aa0db
fix: Restore type hints for low-level api
2024-02-25 16:54:37 -05:00
Andrei Betlen
2292af5796
feat: Update llama.cpp
2024-02-25 16:53:58 -05:00
Andrei Betlen
221edb9ef1
feat: Update llama.cpp
2024-02-24 23:47:29 -05:00
Andrei Betlen
20ea6fd7d6
chore: Bump version
2024-02-23 12:38:36 -05:00
Andrei Betlen
47bad30dd7
fix: LlamaHFTokenizer now receives pre_tokens
2024-02-23 12:23:24 -05:00
Andrei Betlen
ded5d627a5
chore: Bump version
2024-02-23 11:32:43 -05:00
Luke Stanley
858496224e
feat: Auto detect Mixtral's slightly different format (#1214)
2024-02-23 11:27:38 -05:00
Andrei Betlen
db776a885c
fix: module 'llama_cpp.llama_cpp' has no attribute 'c_uint8'
2024-02-23 11:24:53 -05:00
Andrei Betlen
427d816ebf
chore: Bump version
2024-02-23 04:54:08 -05:00
Alvaro Bartolome
251a8a2cad
feat: Add Google's Gemma formatting via chat_format="gemma" (#1210)
* Add Google's Gemma formatting via `chat_format="gemma"`
* Replace `raise ValueError` with `logger.debug`
Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-23 04:40:52 -05:00
Andrei Betlen
b9aca612af
misc: use typesafe byref for internal classes
2024-02-23 03:40:07 -05:00
Andrei Betlen
a0ce429dc0
misc: use decorator to bind low level api functions, fixes docs
2024-02-23 03:39:38 -05:00
Andrei Betlen
e10af30cf1
fix: TypeAlias import error
2024-02-22 03:27:28 -05:00
Andrei Betlen
3561ebf536
Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main
2024-02-22 03:25:13 -05:00
Andrei Betlen
aefcb8f71a
misc: additional type annotations for low level api
2024-02-22 02:00:09 -05:00
Andrei Betlen
3921e10770
feat: support minItems/maxItems in JSON grammar converter (by @nopperl)
2024-02-22 00:17:06 -05:00
Andrei Betlen
e6d6260a91
fix: Update from_pretrained defaults to match hf_hub_download
2024-02-22 00:10:23 -05:00
Andrei Betlen
dd22010e85
fix: Raise exceptions when llama model or context fails to load
2024-02-22 00:09:45 -05:00
Andrei Betlen
3632241e98
chore: Bump version
2024-02-21 23:09:13 -05:00
Andrei Betlen
0653e15c20
feat: Update llama.cpp
2024-02-21 23:04:52 -05:00
Andrei Betlen
7981e9ce1e
chore: Bump version
2024-02-21 16:30:59 -05:00
Andrei
7f51b6071f
feat(low-level-api): Improve API static type-safety and performance (#1205)
2024-02-21 16:25:38 -05:00
Andrei
0f8aa4ab5c
feat: Pull models directly from huggingface (#1206)
* Add from_pretrained method to Llama class
* Update docs
* Merge filename and pattern
2024-02-21 16:25:10 -05:00
Andrei Betlen
e42f62c247
chore: Bump version
2024-02-21 11:09:40 -05:00
Andrei Betlen
4edde21b3d
feat: Update llama.cpp
2024-02-21 11:05:58 -05:00
Andrei Betlen
6225f027e5
feat: Update llama.cpp
2024-02-19 04:11:34 -05:00
Andrei Betlen
748c0ce057
feat: Update llama.cpp
2024-02-18 21:30:36 -05:00
Andrei Betlen
53f6f5f415
fix: self.numa missing
2024-02-17 01:02:33 -05:00
Andrei Betlen
fdce078cb9
feat: Update llama.cpp
2024-02-17 00:37:51 -05:00
Andrei Betlen
f736827b9b
chore: Bump version
2024-02-15 23:10:50 -05:00
Andrei Betlen
0ce66bc080
fix: create_embedding broken response for input type str
2024-02-15 16:09:48 -05:00
khimaros
ea1f88dd29
fix: Use '\n' separator for EventSourceResponse (#1188)
This fixes compatibility with some OpenAI clients, including BetterChatGPT (https://github.com/ztjhz/BetterChatGPT/issues/537).
Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-15 15:20:13 -05:00
Andrei Betlen
a5cfeb7763
feat: Update llama.cpp
2024-02-15 15:17:30 -05:00
Douglas Hanley
7bb91f025f
fix: Incorporate embedding pooling layer fixes (#1194)
* remove division by token count
* truncate to n_batch, not n_ctx
2024-02-15 15:16:30 -05:00
Andrei Betlen
ae71ad1a14
Bump version
2024-02-14 04:31:42 -05:00