Andrei Betlen
a0ce429dc0
misc: use decorator to bind low level api functions, fixes docs
2024-02-23 03:39:38 -05:00
Andrei Betlen
410e02da51
docs: Fix typo
2024-02-23 00:43:31 -05:00
Andrei Betlen
eb56ce2e2a
docs: fix low-level api example
2024-02-22 11:33:05 -05:00
Andrei Betlen
0f8cad6cb7
docs: Update README
2024-02-22 11:31:44 -05:00
Andrei Betlen
045cc12670
docs: Update README
2024-02-22 03:53:52 -05:00
Andrei Betlen
e10af30cf1
fix: TypeAlias import error
2024-02-22 03:27:28 -05:00
Andrei Betlen
3561ebf536
Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main
2024-02-22 03:25:13 -05:00
Andrei Betlen
32efed7b07
docs: Update README
2024-02-22 03:25:11 -05:00
Andrei Betlen
d80c5cf29d
docs: fix indentation for mkdocs-material
2024-02-22 02:30:24 -05:00
Andrei Betlen
aefcb8f71a
misc: additional type annotations for low level api
2024-02-22 02:00:09 -05:00
Andrei Betlen
3921e10770
feat: support minItems/maxItems in JSON grammar converter (by @nopperl)
2024-02-22 00:17:06 -05:00
Andrei Betlen
e6d6260a91
fix: Update from_pretrained defaults to match hf_hub_download
2024-02-22 00:10:23 -05:00
Andrei Betlen
dd22010e85
fix: Raise exceptions when llama model or context fails to load
2024-02-22 00:09:45 -05:00
Andrei Betlen
3632241e98
chore: Bump version
2024-02-21 23:09:13 -05:00
Andrei Betlen
0653e15c20
feat: Update llama.cpp
2024-02-21 23:04:52 -05:00
Andrei Betlen
7981e9ce1e
chore: Bump version
2024-02-21 16:30:59 -05:00
Andrei Betlen
7f3962e11c
feat: Update llama.cpp
2024-02-21 16:27:56 -05:00
Andrei Betlen
14191e9036
docs: Add create_chat_completion_openai_v1 to api reference
2024-02-21 16:26:49 -05:00
Andrei Betlen
fe5626cd40
misc: add .local pattern to gitignore
2024-02-21 16:26:30 -05:00
Andrei
7f51b6071f
feat(low-level-api): Improve API static type-safety and performance (#1205)
2024-02-21 16:25:38 -05:00
Andrei
0f8aa4ab5c
feat: Pull models directly from huggingface (#1206)
* Add from_pretrained method to Llama class
* Update docs
* Merge filename and pattern
2024-02-21 16:25:10 -05:00
Andrei Betlen
e42f62c247
chore: Bump version
2024-02-21 11:09:40 -05:00
Andrei Betlen
4edde21b3d
feat: Update llama.cpp
2024-02-21 11:05:58 -05:00
Andrei Betlen
f57b01ac9b
ci: add debug build to dev makefile
2024-02-21 11:04:30 -05:00
Andrei Betlen
04fe33b999
feat: Update llama.cpp
2024-02-20 02:59:02 -05:00
Andrei Betlen
d122bd7858
feat: Update llama.cpp
2024-02-19 22:10:16 -05:00
Andrei Betlen
6225f027e5
feat: Update llama.cpp
2024-02-19 04:11:34 -05:00
Andrei Betlen
748c0ce057
feat: Update llama.cpp
2024-02-18 21:30:36 -05:00
Andrei Betlen
53f6f5f415
fix: self.numa missing
2024-02-17 01:02:33 -05:00
Andrei Betlen
fdce078cb9
feat: Update llama.cpp
2024-02-17 00:37:51 -05:00
Andrei Betlen
c2a234a086
docs: Add embeddings section
2024-02-15 23:15:50 -05:00
Andrei Betlen
f736827b9b
chore: Bump version
2024-02-15 23:10:50 -05:00
Andrei Betlen
0ce66bc080
fix: create_embedding broken response for input type str
2024-02-15 16:09:48 -05:00
khimaros
ea1f88dd29
fix: Use '\n' separator for EventSourceResponse (#1188)
This fixes compatibility with some OpenAI clients, including BetterChatGPT (https://github.com/ztjhz/BetterChatGPT/issues/537).
Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-15 15:20:13 -05:00
Andrei Betlen
a5cfeb7763
feat: Update llama.cpp
2024-02-15 15:17:30 -05:00
Douglas Hanley
7bb91f025f
fix: Incorporate embedding pooling layer fixes (#1194)
* remove division by token count
* truncate to n_batch, not n_ctx
2024-02-15 15:16:30 -05:00
Andrei Betlen
ae71ad1a14
Bump version
2024-02-14 04:31:42 -05:00
Andrei Betlen
f300d4310a
Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main
2024-02-14 04:27:33 -05:00
Andrei Betlen
c336f78269
Update llama.cpp
2024-02-14 04:27:30 -05:00
Douglas Hanley
d7a67917ba
feat: Support batch embeddings (#1186)
* handle batched embeddings
* fix normalization issue
* fix type hints, ensure no breaking changes to embed
* Clear kv cache / reset internal state after embedding complete
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-14 04:26:09 -05:00
Andrei Betlen
36b843228f
misc: fix makefile build commands
2024-02-14 03:47:40 -05:00
Andrei Betlen
7b9960d1cb
Update llama.cpp
2024-02-14 03:47:21 -05:00
Andrei Betlen
6943bab6d8
fix: destructor exception where internal classes are missing some uninitialized attributes
2024-02-14 03:38:41 -05:00
Andrei Betlen
07a783779a
fix: Update openbuddy prompt format. Closes #1155
2024-02-13 23:57:10 -05:00
Andrei Betlen
7a79e5ac49
Update llama.cpp
2024-02-13 23:54:05 -05:00
Andrei Betlen
7dbbfdecad
fix: submodule kompute is not included in sdist. Closes #1165
2024-02-13 23:53:56 -05:00
Andrei Betlen
345215a76c
fix: more chatml-function-calling fixes
2024-02-13 23:02:50 -05:00
Andrei Betlen
b1637c2319
Bump version
2024-02-13 12:35:04 -05:00
Andrew Lapp
d6be5333e1
fix: sample idx off-by-one error for logit_processors (#1179)
* fix sample_idx off-by-one error
* self._scores is indexed differently, only modify the index within self._input_ids
---------
Co-authored-by: Andrew Lapp <andrew@rew.la>
Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-13 12:26:07 -05:00
Andrei Betlen
f7cdf78788
Update llama.cpp
2024-02-13 12:24:00 -05:00