Andrei Betlen
e6d6260a91
fix: Update from_pretrained defaults to match hf_hub_download
2024-02-22 00:10:23 -05:00
Andrei Betlen
dd22010e85
fix: Raise exceptions when llama model or context fails to load
2024-02-22 00:09:45 -05:00
Andrei Betlen
3632241e98
chore: Bump version
2024-02-21 23:09:13 -05:00
Andrei Betlen
0653e15c20
feat: Update llama.cpp
2024-02-21 23:04:52 -05:00
Andrei Betlen
7981e9ce1e
chore: Bump version
2024-02-21 16:30:59 -05:00
Andrei Betlen
7f3962e11c
feat: Update llama.cpp
2024-02-21 16:27:56 -05:00
Andrei Betlen
14191e9036
docs: Add create_chat_completion_openai_v1 to api reference
2024-02-21 16:26:49 -05:00
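The `create_chat_completion_openai_v1` method documented above returns an openai-python v1-style response object instead of a plain dict. A minimal usage sketch (the model path and message content are placeholders, and the import is deferred since it requires llama-cpp-python to be installed):

```python
def chat_openai_v1(model_path):
    """Sketch: call create_chat_completion_openai_v1 on a local GGUF model.

    Returns an openai-python v1 style ChatCompletion object rather than
    a plain dict, so fields are attribute-accessible.
    """
    from llama_cpp import Llama  # deferred: requires llama-cpp-python

    llama = Llama(model_path=model_path, verbose=False)
    response = llama.create_chat_completion_openai_v1(
        messages=[{"role": "user", "content": "Hello!"}],
    )
    # Attribute access instead of dict indexing:
    return response.choices[0].message.content
```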
Andrei Betlen
fe5626cd40
misc: add .local pattern to gitignore
2024-02-21 16:26:30 -05:00
Andrei
7f51b6071f
feat(low-level-api): Improve API static type-safety and performance (#1205)
2024-02-21 16:25:38 -05:00
Andrei
0f8aa4ab5c
feat: Pull models directly from huggingface (#1206)
* Add from_pretrained method to Llama class
* Update docs
* Merge filename and pattern
2024-02-21 16:25:10 -05:00
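The `from_pretrained` method added in #1206 wraps `huggingface_hub`'s `hf_hub_download` (whose defaults the later fix aligns with), merging the filename and pattern arguments into one glob. A minimal sketch, with the repo id and filename glob as placeholders:

```python
def pull_model():
    """Sketch: download a GGUF file from the Hugging Face Hub and load it.

    Llama.from_pretrained wraps huggingface_hub's hf_hub_download; the
    repo id and filename glob below are illustrative placeholders.
    """
    from llama_cpp import Llama  # requires llama-cpp-python + huggingface-hub

    llama = Llama.from_pretrained(
        repo_id="Qwen/Qwen1.5-0.5B-Chat-GGUF",  # placeholder repo
        filename="*q8_0.gguf",                  # glob matched against repo files
        verbose=False,
    )
    return llama
```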
Andrei Betlen
e42f62c247
chore: Bump version
2024-02-21 11:09:40 -05:00
Andrei Betlen
4edde21b3d
feat: Update llama.cpp
2024-02-21 11:05:58 -05:00
Andrei Betlen
f57b01ac9b
ci: add debug build to dev makefile
2024-02-21 11:04:30 -05:00
Andrei Betlen
04fe33b999
feat: Update llama.cpp
2024-02-20 02:59:02 -05:00
Andrei Betlen
d122bd7858
feat: Update llama.cpp
2024-02-19 22:10:16 -05:00
Andrei Betlen
6225f027e5
feat: Update llama.cpp
2024-02-19 04:11:34 -05:00
Andrei Betlen
748c0ce057
feat: Update llama.cpp
2024-02-18 21:30:36 -05:00
Andrei Betlen
53f6f5f415
fix: self.numa missing
2024-02-17 01:02:33 -05:00
Andrei Betlen
fdce078cb9
feat: Update llama.cpp
2024-02-17 00:37:51 -05:00
Andrei Betlen
c2a234a086
docs: Add embeddings section
2024-02-15 23:15:50 -05:00
Andrei Betlen
f736827b9b
chore: Bump version
2024-02-15 23:10:50 -05:00
Andrei Betlen
0ce66bc080
fix: create_embedding broken response for input type str
2024-02-15 16:09:48 -05:00
khimaros
ea1f88dd29
fix: Use '\n' separator for EventSourceResponse (#1188)
This fixes compatibility with some OpenAI clients, including BetterChatGPT (https://github.com/ztjhz/BetterChatGPT/issues/537).
Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-15 15:20:13 -05:00
Andrei Betlen
a5cfeb7763
feat: Update llama.cpp
2024-02-15 15:17:30 -05:00
Douglas Hanley
7bb91f025f
fix: Incorporate embedding pooling layer fixes (#1194)
* remove division by token count
* truncate to n_batch, not n_ctx
2024-02-15 15:16:30 -05:00
Andrei Betlen
ae71ad1a14
Bump version
2024-02-14 04:31:42 -05:00
Andrei Betlen
f300d4310a
Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main
2024-02-14 04:27:33 -05:00
Andrei Betlen
c336f78269
Update llama.cpp
2024-02-14 04:27:30 -05:00
Douglas Hanley
d7a67917ba
feat: Support batch embeddings (#1186)
* handle batched embeddings
* fix normalization issue
* fix type hints, ensure no breaking changes to embed
* Clear kv cache / reset internal state after embedding complete
Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-14 04:26:09 -05:00
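With #1186, `create_embedding` accepts either a single string or a list of strings, returning one vector per input. A sketch of batched usage (the model path is a placeholder; `embedding=True` is required at load time):

```python
def embed_batch(model_path, texts):
    """Sketch: batched embeddings, enabled by passing embedding=True.

    create_embedding accepts a single string or a list of strings; each
    input yields one entry in the OpenAI-style response's "data" list.
    """
    from llama_cpp import Llama  # deferred: requires llama-cpp-python

    llama = Llama(model_path=model_path, embedding=True, verbose=False)
    response = llama.create_embedding(texts)  # list input -> batched
    return [item["embedding"] for item in response["data"]]
```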
Andrei Betlen
36b843228f
misc: fix makefile build commands
2024-02-14 03:47:40 -05:00
Andrei Betlen
7b9960d1cb
Update llama.cpp
2024-02-14 03:47:21 -05:00
Andrei Betlen
6943bab6d8
fix: destructor exception when internal classes have uninitialized attributes
2024-02-14 03:38:41 -05:00
Andrei Betlen
07a783779a
fix: Update openbuddy prompt format. Closes #1155
2024-02-13 23:57:10 -05:00
Andrei Betlen
7a79e5ac49
Update llama.cpp
2024-02-13 23:54:05 -05:00
Andrei Betlen
7dbbfdecad
fix: submodule kompute is not included in sdist. Closes #1165
2024-02-13 23:53:56 -05:00
Andrei Betlen
345215a76c
fix: more chatml-function-calling fixes
2024-02-13 23:02:50 -05:00
Andrei Betlen
b1637c2319
Bump version
2024-02-13 12:35:04 -05:00
Andrew Lapp
d6be5333e1
fix: sample idx off-by-one error for logit_processors (#1179)
* fix sample_idx off-by-one error
* self._scores is indexed differently, only modify the index within self._input_ids
Co-authored-by: Andrew Lapp <andrew@rew.la>
Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-13 12:26:07 -05:00
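A logits processor in this API is a callable taking the generated token ids and the raw next-token scores, and returning modified scores; the off-by-one fix above ensures the processor sees the scores at the correct sample index. A minimal, library-free sketch of that contract (names and values are illustrative):

```python
def make_token_ban(banned_token_id):
    """Return a logits-processor-style callable that masks one token id.

    Contract (as in llama-cpp-python): processor(input_ids, scores)
    -> new scores, where scores[i] is the logit for token id i at the
    position immediately after input_ids[-1].
    """
    def processor(input_ids, scores):
        new_scores = list(scores)              # don't mutate caller's scores
        new_scores[banned_token_id] = float("-inf")
        return new_scores
    return processor

ban_two = make_token_ban(2)
masked = ban_two([5, 9], [0.1, 0.7, 1.3, 0.2])  # token id 2 is now unsampleable
```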
Andrei Betlen
f7cdf78788
Update llama.cpp
2024-02-13 12:24:00 -05:00
Andrei Betlen
68fb71b6a2
fix: missing generation_prompt in chatml-function-calling
2024-02-13 03:24:41 -05:00
Andrei Betlen
4b0e3320bd
fix: minor formatting bugs for chatml-function-calling
2024-02-13 03:11:35 -05:00
Andrei Betlen
6fe8b427e1
Bump version
2024-02-13 02:46:52 -05:00
Andrei Betlen
d1822fed6b
fix: Don't change order of json schema object properties unless prop_order is passed, Closes #1180
2024-02-13 02:44:00 -05:00
Andrei Betlen
5efc45bdfd
Update llama.cpp
2024-02-13 02:43:07 -05:00
Andrei Betlen
4348a6cdf0
docs: Fix typo
2024-02-13 02:04:54 -05:00
Andrei Betlen
d605875772
Bump version
2024-02-12 16:28:30 -05:00
Andrei Betlen
b82b0e1014
docs: Temporarily revert function calling docs
2024-02-12 16:27:43 -05:00
Andrei Betlen
cb791716b4
fix: Always set logits_all = True when using speculative decoding
2024-02-12 16:19:05 -05:00
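Speculative decoding here is driven by a draft model supplied at construction time; per the fix above, `logits_all` is forced on when one is present, so passing it explicitly is redundant but harmless. A sketch, assuming the prompt-lookup draft model shipped with the library (the model path and `num_pred_tokens` value are placeholders):

```python
def load_speculative(model_path):
    """Sketch: prompt-lookup speculative decoding via draft_model.

    Per the fix above, logits_all is set to True internally whenever a
    draft model is supplied.
    """
    from llama_cpp import Llama  # deferred: requires llama-cpp-python
    from llama_cpp.llama_speculative import LlamaPromptLookupDecoding

    return Llama(
        model_path=model_path,
        draft_model=LlamaPromptLookupDecoding(num_pred_tokens=10),
        logits_all=True,  # redundant: forced on with a draft model
    )
```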
Andrei
153a0049d9
feat: Generic chatml Function Calling (#957)
* Add demo notebook
* Add initial chat handler
* Update OpenAI types
* Add generic chatml function calling (wip)
* Update chatml generic function calling.
* Progress on auto-tool calls
* fix streaming functions
* Remove print statements
* fix: Suppress output from llama.cpp init and grammar creation
* Add OpenAI v1 python api compatible chat completion function
* Support non-streaming multi-tool calls
* Format
* Include function_call in response.
2024-02-12 15:56:07 -05:00
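The generic handler added in #957 is selected with `chat_format="chatml-function-calling"` and accepts OpenAI-style `tools`/`tool_choice` arguments. A sketch, where the model path and the tool schema are illustrative placeholders:

```python
def chat_with_tools(model_path):
    """Sketch: generic chatml function calling added by #957.

    The get_weather tool schema below is illustrative; any OpenAI-style
    function schema works.
    """
    from llama_cpp import Llama  # deferred: requires llama-cpp-python

    llama = Llama(
        model_path=model_path,
        chat_format="chatml-function-calling",
        verbose=False,
    )
    return llama.create_chat_completion(
        messages=[{"role": "user", "content": "What's the weather in Paris?"}],
        tools=[{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
        tool_choice={"type": "function", "function": {"name": "get_weather"}},
    )
```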
Andrei Betlen
69413ce08e
Update llama.cpp
2024-02-11 19:00:17 -05:00