Commit graph

1504 commits

Author SHA1 Message Date
Andrei Betlen
e6d6260a91 fix: Update from_pretrained defaults to match hf_hub_download 2024-02-22 00:10:23 -05:00
Andrei Betlen
dd22010e85 fix: Raise exceptions when llama model or context fails to load 2024-02-22 00:09:45 -05:00
Andrei Betlen
3632241e98 chore: Bump version 2024-02-21 23:09:13 -05:00
Andrei Betlen
0653e15c20 feat: Update llama.cpp 2024-02-21 23:04:52 -05:00
Andrei Betlen
7981e9ce1e chore: Bump version 2024-02-21 16:30:59 -05:00
Andrei Betlen
7f3962e11c feat: Update llama.cpp 2024-02-21 16:27:56 -05:00
Andrei Betlen
14191e9036 docs: Add create_chat_completion_openai_v1 to api reference 2024-02-21 16:26:49 -05:00
Andrei Betlen
fe5626cd40 misc: add .local pattern to gitignore 2024-02-21 16:26:30 -05:00
Andrei
7f51b6071f
feat(low-level-api): Improve API static type-safety and performance (#1205) 2024-02-21 16:25:38 -05:00
Andrei
0f8aa4ab5c
feat: Pull models directly from huggingface (#1206)
* Add from_pretrained method to Llama class

* Update docs

* Merge filename and pattern
2024-02-21 16:25:10 -05:00
Andrei Betlen
e42f62c247 chore: Bump version 2024-02-21 11:09:40 -05:00
Andrei Betlen
4edde21b3d feat: Update llama.cpp 2024-02-21 11:05:58 -05:00
Andrei Betlen
f57b01ac9b ci: add debug build to dev makefile 2024-02-21 11:04:30 -05:00
Andrei Betlen
04fe33b999 feat: Update llama.cpp 2024-02-20 02:59:02 -05:00
Andrei Betlen
d122bd7858 feat: Update llama.cpp 2024-02-19 22:10:16 -05:00
Andrei Betlen
6225f027e5 feat: Update llama.cpp 2024-02-19 04:11:34 -05:00
Andrei Betlen
748c0ce057 feat: Update llama.cpp 2024-02-18 21:30:36 -05:00
Andrei Betlen
53f6f5f415 fix: self.numa missing 2024-02-17 01:02:33 -05:00
Andrei Betlen
fdce078cb9 feat: Update llama.cpp 2024-02-17 00:37:51 -05:00
Andrei Betlen
c2a234a086 docs: Add embeddings section 2024-02-15 23:15:50 -05:00
Andrei Betlen
f736827b9b chore: Bump version 2024-02-15 23:10:50 -05:00
Andrei Betlen
0ce66bc080 fix: create_embedding broken response for input type str 2024-02-15 16:09:48 -05:00
khimaros
ea1f88dd29
fix: Use '\n' separator for EventSourceResponse (#1188)
this fixes compatibility with some OpenAI clients, including BetterChatGPT (https://github.com/ztjhz/BetterChatGPT/issues/537).

Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-15 15:20:13 -05:00
Andrei Betlen
a5cfeb7763 feat: Update llama.cpp 2024-02-15 15:17:30 -05:00
Douglas Hanley
7bb91f025f
fix: Incorporate embedding pooling layer fixes (#1194)
* remove division by token count

* truncate to n_batch, not n_ctx
2024-02-15 15:16:30 -05:00
Andrei Betlen
ae71ad1a14 Bump version 2024-02-14 04:31:42 -05:00
Andrei Betlen
f300d4310a Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main 2024-02-14 04:27:33 -05:00
Andrei Betlen
c336f78269 Update llama.cpp 2024-02-14 04:27:30 -05:00
Douglas Hanley
d7a67917ba
feat: Support batch embeddings (#1186)
* handle batched embeddings

* fix normalization issue

* fix type hints, ensure no breaking changes to embed

* Clear kv cache / reset internal state after embedding complete

---------

Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-14 04:26:09 -05:00
Andrei Betlen
36b843228f misc: fix makefile build commands 2024-02-14 03:47:40 -05:00
Andrei Betlen
7b9960d1cb Update llama.cpp 2024-02-14 03:47:21 -05:00
Andrei Betlen
6943bab6d8 fix: destructor exception where internal classes are missing some uninitialized attributes 2024-02-14 03:38:41 -05:00
Andrei Betlen
07a783779a fix: Update openbuddy prompt format. Closes #1155 2024-02-13 23:57:10 -05:00
Andrei Betlen
7a79e5ac49 Update llama.cpp 2024-02-13 23:54:05 -05:00
Andrei Betlen
7dbbfdecad fix: submodule kompute is not included in sdist. Closes #1165 2024-02-13 23:53:56 -05:00
Andrei Betlen
345215a76c fix: more chatml-function-calling fixes 2024-02-13 23:02:50 -05:00
Andrei Betlen
b1637c2319 Bump version 2024-02-13 12:35:04 -05:00
Andrew Lapp
d6be5333e1
fix: sample idx off-by-one error for logit_processors (#1179)
* fix sample_idx off-by-one error

* self._scores is indexed differently, only modify the index within self._input_ids

---------

Co-authored-by: Andrew Lapp <andrew@rew.la>
Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-13 12:26:07 -05:00
Andrei Betlen
f7cdf78788 Update llama.cpp 2024-02-13 12:24:00 -05:00
Andrei Betlen
68fb71b6a2 fix: missing generation_prompt in chatml-function-calling 2024-02-13 03:24:41 -05:00
Andrei Betlen
4b0e3320bd fix: minor formatting bugs for chatml-function-calling 2024-02-13 03:11:35 -05:00
Andrei Betlen
6fe8b427e1 Bump version 2024-02-13 02:46:52 -05:00
Andrei Betlen
d1822fed6b fix: Don't change order of json schema object properties unless prop_order is passed, Closes #1180 2024-02-13 02:44:00 -05:00
Andrei Betlen
5efc45bdfd Update llama.cpp 2024-02-13 02:43:07 -05:00
Andrei Betlen
4348a6cdf0 docs: Fix typo 2024-02-13 02:04:54 -05:00
Andrei Betlen
d605875772 Bump version 2024-02-12 16:28:30 -05:00
Andrei Betlen
b82b0e1014 docs: Temporarily revert function calling docs 2024-02-12 16:27:43 -05:00
Andrei Betlen
cb791716b4 fix: Always set logits_all = True when using speculative decoding 2024-02-12 16:19:05 -05:00
Andrei
153a0049d9
feat: Generic chatml Function Calling (#957)
* Add demo notebook

* Add initial chat handler

* Update OpenAI types

* Add generic chatml function calling (wip)

* Update chatml generic function calling.

* Progress on auto-tool calls

* fix streaming functions

* Remove print statements

* fix: Suppress output from llama.cpp init and grammar creation

* Add OpenAI v1 python api compatible chat completion function

* Support non-streaming multi-tool calls

* Format

* Include function_call in response.
2024-02-12 15:56:07 -05:00
Andrei Betlen
69413ce08e Update llama.cpp 2024-02-11 19:00:17 -05:00