Commit graph

1729 commits

Author SHA1 Message Date
Andrei Betlen
b681674bf2 docs: Fix functionary repo_id 2024-02-23 12:36:13 -05:00
Andrei Betlen
f94faab686 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main 2024-02-23 12:34:03 -05:00
Andrei Betlen
702306b381 docs: Restore functionary docs in README 2024-02-23 12:34:02 -05:00
Jeffrey Fong
bce6dc0ac2
docs: Update Functionary OpenAI Server Readme (#1193)
* update functionary parts in server readme

* add write-up about hf tokenizer
2024-02-23 12:24:10 -05:00
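For reference, a minimal sketch of the setup that write-up describes: functionary chat formats build prompts with the model's own Hugging Face tokenizer rather than the bundled llama.cpp tokenizer. The repo_id and filename below are illustrative placeholders, not values taken from this commit.

```python
from llama_cpp import Llama
from llama_cpp.llama_tokenizer import LlamaHFTokenizer

# Illustrative repo_id/filename; any functionary-v2 GGUF release should work.
llm = Llama.from_pretrained(
    repo_id="meetkai/functionary-small-v2.2-GGUF",
    filename="functionary-small-v2.2.q4_0.gguf",
    chat_format="functionary-v2",
    # Functionary prompts are assembled with the model's HF tokenizer.
    tokenizer=LlamaHFTokenizer.from_pretrained("meetkai/functionary-small-v2.2-GGUF"),
)
```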
Andrei Betlen
47bad30dd7 fix: LlamaHFTokenizer now receives pre_tokens 2024-02-23 12:23:24 -05:00
Andrei Betlen
ded5d627a5 chore: Bump version 2024-02-23 11:32:43 -05:00
Luke Stanley
858496224e
feat: Auto detect Mixtral's slightly different format (#1214) 2024-02-23 11:27:38 -05:00
Andrei Betlen
db776a885c fix: module 'llama_cpp.llama_cpp' has no attribute 'c_uint8' 2024-02-23 11:24:53 -05:00
Andrei Betlen
427d816ebf chore: Bump version 2024-02-23 04:54:08 -05:00
Aditya Purandare
52d9d70076
docs: Update README.md to fix pip install llama cpp server (#1187)
Without the single quotes, running the command prints an error saying no matching packages were found on PyPI; quoting the extra fixes it.

```bash
$ pip install llama-cpp-python[server]
zsh: no matches found: llama-cpp-python[server]

$ pip install 'llama-cpp-python[server]'
```

Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-23 04:41:22 -05:00
Alvaro Bartolome
251a8a2cad
feat: Add Google's Gemma formatting via chat_format="gemma" (#1210)
* Add Google's Gemma formatting via `chat_format="gemma"`

* Replace `raise ValueError` with `logger.debug`

Co-authored-by: Andrei <abetlen@gmail.com>

---------

Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-23 04:40:52 -05:00
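A minimal sketch of selecting the new format; the model path is a placeholder for any local Gemma GGUF file.

```python
from llama_cpp import Llama

# Placeholder path to a local Gemma GGUF model.
llm = Llama(model_path="./gemma-7b-it.Q4_K_M.gguf", chat_format="gemma")

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a haiku about llamas."}]
)
print(response["choices"][0]["message"]["content"])
```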
Andrei Betlen
eebb102df7 feat: Update llama.cpp 2024-02-23 03:42:08 -05:00
Andrei Betlen
5f96621e92 misc: only search tests folder for tests 2024-02-23 03:40:25 -05:00
Andrei Betlen
b9aca612af misc: use typesafe byref for internal classes 2024-02-23 03:40:07 -05:00
Andrei Betlen
a0ce429dc0 misc: use decorator to bind low level api functions, fixes docs 2024-02-23 03:39:38 -05:00
Andrei Betlen
410e02da51 docs: Fix typo 2024-02-23 00:43:31 -05:00
Andrei Betlen
eb56ce2e2a docs: fix low-level api example 2024-02-22 11:33:05 -05:00
Andrei Betlen
0f8cad6cb7 docs: Update README 2024-02-22 11:31:44 -05:00
Andrei Betlen
045cc12670 docs: Update README 2024-02-22 03:53:52 -05:00
Andrei Betlen
e10af30cf1 fix: TypeAlias import error 2024-02-22 03:27:28 -05:00
Andrei Betlen
3561ebf536 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main 2024-02-22 03:25:13 -05:00
Andrei Betlen
32efed7b07 docs: Update README 2024-02-22 03:25:11 -05:00
Andrei Betlen
d80c5cf29d docs: fix indentation for mkdocs-material 2024-02-22 02:30:24 -05:00
Andrei Betlen
aefcb8f71a misc: additional type annotations for low level api 2024-02-22 02:00:09 -05:00
Andrei Betlen
3921e10770 feat: support minItems/maxItems in JSON grammar converter (by @nopperl) 2024-02-22 00:17:06 -05:00
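As a sketch, an illustrative JSON schema whose array bounds the converter can now express as a grammar; such a schema can be passed to chat completion via response_format={"type": "json_object", "schema": schema} to constrain generation.

```python
# Illustrative schema only; the minItems/maxItems bounds are what the converter now honors.
schema = {
    "type": "object",
    "properties": {
        "tags": {
            "type": "array",
            "items": {"type": "string"},
            "minItems": 1,  # require at least one tag
            "maxItems": 3,  # and no more than three
        },
    },
    "required": ["tags"],
}
```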
Andrei Betlen
e6d6260a91 fix: Update from_pretrained defaults to match hf_hub_download 2024-02-22 00:10:23 -05:00
Andrei Betlen
dd22010e85 fix: Raise exceptions when llama model or context fails to load 2024-02-22 00:09:45 -05:00
Andrei Betlen
3632241e98 chore: Bump version 2024-02-21 23:09:13 -05:00
Andrei Betlen
0653e15c20 feat: Update llama.cpp 2024-02-21 23:04:52 -05:00
Andrei Betlen
7981e9ce1e chore: Bump version 2024-02-21 16:30:59 -05:00
Andrei Betlen
7f3962e11c feat: Update llama.cpp 2024-02-21 16:27:56 -05:00
Andrei Betlen
14191e9036 docs: Add create_chat_completion_openai_v1 to api reference 2024-02-21 16:26:49 -05:00
Andrei Betlen
fe5626cd40 misc: add .local pattern to gitignore 2024-02-21 16:26:30 -05:00
Andrei
7f51b6071f
feat(low-level-api): Improve API static type-safety and performance (#1205) 2024-02-21 16:25:38 -05:00
Andrei
0f8aa4ab5c
feat: Pull models directly from huggingface (#1206)
* Add from_pretrained method to Llama class

* Update docs

* Merge filename and pattern
2024-02-21 16:25:10 -05:00
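For reference, a minimal sketch of the new method; the repo_id and filename pattern below are illustrative.

```python
from llama_cpp import Llama

# Downloads (and caches) a GGUF file from the Hugging Face Hub, then loads it.
# repo_id/filename are placeholders; filename is matched as a glob against repo files.
llm = Llama.from_pretrained(
    repo_id="TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
    filename="*Q4_K_M.gguf",
    verbose=False,
)
```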
Andrei Betlen
e42f62c247 chore: Bump version 2024-02-21 11:09:40 -05:00
Andrei Betlen
4edde21b3d feat: Update llama.cpp 2024-02-21 11:05:58 -05:00
Andrei Betlen
f57b01ac9b ci: add debug build to dev makefile 2024-02-21 11:04:30 -05:00
Andrei Betlen
04fe33b999 feat: Update llama.cpp 2024-02-20 02:59:02 -05:00
Andrei Betlen
d122bd7858 feat: Update llama.cpp 2024-02-19 22:10:16 -05:00
Andrei Betlen
6225f027e5 feat: Update llama.cpp 2024-02-19 04:11:34 -05:00
Andrei Betlen
748c0ce057 feat: Update llama.cpp 2024-02-18 21:30:36 -05:00
Andrei Betlen
53f6f5f415 fix: self.numa missing 2024-02-17 01:02:33 -05:00
Andrei Betlen
fdce078cb9 feat: Update llama.cpp 2024-02-17 00:37:51 -05:00
Andrei Betlen
c2a234a086 docs: Add embeddings section 2024-02-15 23:15:50 -05:00
Andrei Betlen
f736827b9b chore: Bump version 2024-02-15 23:10:50 -05:00
Andrei Betlen
0ce66bc080 fix: create_embedding broken response for input type str 2024-02-15 16:09:48 -05:00
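A minimal usage sketch of the affected call path, assuming a local GGUF model (the path is a placeholder).

```python
from llama_cpp import Llama

# embedding=True enables create_embedding; the model path is a placeholder.
llm = Llama(model_path="./model.gguf", embedding=True, verbose=False)

# A plain string input is the case this fix addresses (lists of strings also work).
result = llm.create_embedding("A sentence to embed.")
print(len(result["data"][0]["embedding"]))
```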
khimaros
ea1f88dd29
fix: Use '\n' separator for EventSourceResponse (#1188)
This fixes compatibility with some OpenAI clients, including BetterChatGPT (https://github.com/ztjhz/BetterChatGPT/issues/537).

Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-15 15:20:13 -05:00
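A sketch of the change in terms of sse_starlette's API; the event payload below is a toy placeholder, not the server's actual chunk format.

```python
from sse_starlette.sse import EventSourceResponse

async def events():
    # Toy stream; the real server yields chat-completion chunks as JSON strings.
    yield {"data": "hello"}

# Using "\n" as the separator instead of the default "\r\n" keeps clients such as
# BetterChatGPT able to parse the stream.
response = EventSourceResponse(events(), sep="\n")
```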
Andrei Betlen
a5cfeb7763 feat: Update llama.cpp 2024-02-15 15:17:30 -05:00
Douglas Hanley
7bb91f025f
fix: Incorporate embedding pooling layer fixes (#1194)
* remove division by token count

* truncate to n_batch, not n_ctx
2024-02-15 15:16:30 -05:00