Commit graph

1668 commits

Author SHA1 Message Date
Andrei Betlen
843e77e3e2 docs: Add Vulkan build instructions 2024-01-29 11:01:26 -05:00
Andrei Betlen
464af5b39f Bump version 2024-01-29 10:46:04 -05:00
Andrei Betlen
9f7852acfa misc: Add vulkan target 2024-01-29 10:39:23 -05:00
Andrei Betlen
85f8c4c06e Update llama.cpp 2024-01-29 10:39:08 -05:00
Andrei Betlen
9ae5819ee4 Add chat format test. 2024-01-29 00:59:01 -05:00
Rafaelblsilva
ce38dbdf07 Add mistral instruct chat format as "mistral-instruct" (#799) 2024-01-29 00:34:42 -05:00
* Added mistral instruct chat format as "mistral"
* Fix stop sequence (merge issue)
* Update chat format name to `mistral-instruct`

---------

Co-authored-by: Andrei <abetlen@gmail.com>
Andrei Betlen
52c4a84faf Bump version 2024-01-28 19:35:37 -05:00
Andrei Betlen
31e0288a41 Update llama.cpp 2024-01-28 19:34:27 -05:00
Andrei Betlen
ccf4908bfd Update llama.cpp 2024-01-28 12:55:32 -05:00
Andrei Betlen
8c59210062 docs: Fix typo 2024-01-27 19:37:59 -05:00
Andrei Betlen
399fa1e03b docs: Add JSON and JSON schema mode examples to README 2024-01-27 19:36:33 -05:00
Andrei Betlen
c1d0fff8a9 Bump version 2024-01-27 18:36:56 -05:00
Andrei
d8f6914f45 Add json schema mode (#1122) 2024-01-27 16:52:18 -05:00
* Add json schema mode
* Add llava chat format support
Andrei Betlen
c6d3bd62e8 Update llama.cpp 2024-01-27 16:22:46 -05:00
Andrei Betlen
35918873b4 Update llama.cpp 2024-01-26 11:45:48 -05:00
Andrei Betlen
f5cc6b3053 Bump version 2024-01-25 11:28:16 -05:00
Andrei Betlen
cde7514c3d feat(server): include llama-cpp-python version in openapi spec 2024-01-25 11:23:18 -05:00
Andrei Betlen
2588f34a22 Update llama.cpp 2024-01-25 11:22:42 -05:00
Andrei Betlen
dc5a436224 Update llama.cpp 2024-01-25 11:19:34 -05:00
Andrei Betlen
d6fb16e055 docs: Update README 2024-01-25 10:51:48 -05:00
Andrei Betlen
5b258bf840 docs: Update README with more param common examples 2024-01-24 10:51:15 -05:00
Andrei Betlen
c343baaba8 Update llama.cpp 2024-01-24 10:40:50 -05:00
Andrei Betlen
c970d41a85 fix: llama_log_set should be able to accept null pointer 2024-01-24 10:38:30 -05:00
Andrei Betlen
9677a1f2c8 fix: Check order 2024-01-23 22:28:03 -05:00
Andrei Betlen
4d6b2f7b91 fix: format 2024-01-23 22:08:27 -05:00
Phil H
fe5d6ea648 fix: GGUF metadata KV overrides, re #1011 (#1116) 2024-01-23 22:00:38 -05:00
* kv overrides another attempt
* add sentinel element, simplify array population
* ensure sentinel element is zeroed
Andrei Betlen
7e63928bc9 Update llama.cpp 2024-01-23 18:42:39 -05:00
Andrei Betlen
fcdf337d84 Update llama.cpp 2024-01-22 11:25:11 -05:00
Andrei Betlen
5b982d0f8c fix: use both eos and bos tokens as stop sequences for hf-tokenizer-config chat format. 2024-01-22 08:32:48 -05:00
Andrei Betlen
2ce0b8aa2c Bump version 2024-01-21 20:30:24 -05:00
Andrei Betlen
d3f5528ca8 fix: from_json_schema oneof/anyof bug. Closes #1097 2024-01-21 19:06:53 -05:00
Andrei Betlen
8eefdbca03 Update llama.cpp 2024-01-21 19:01:27 -05:00
Andrei Betlen
88fbccaaa3 docs: Add macosx wrong arch fix to README 2024-01-21 18:38:44 -05:00
Andrei Betlen
24f39454e9 fix: pass chat handler not chat formatter for huggingface autotokenizer and tokenizer_config formats. 2024-01-21 18:38:04 -05:00
Andrei Betlen
7f3209b1eb feat: Add add_generation_prompt option for jinja2chatformatter. 2024-01-21 18:37:24 -05:00
Andrei Betlen
ac2e96d4b4 Update llama.cpp 2024-01-19 15:33:43 -05:00
Andrei Betlen
be09318c26 feat: Add Jinja2ChatFormatter 2024-01-19 15:04:42 -05:00
Andrei Betlen
5a34c57e54 feat: Expose gguf model metadata in metadata property 2024-01-19 10:46:03 -05:00
Andrei Betlen
833a7f1a86 Bump version 2024-01-19 09:03:35 -05:00
Andrei Betlen
e21c3c7a91 Update makefile 2024-01-19 08:47:56 -05:00
Andrei Betlen
0f54948482 Update llama.cpp 2024-01-19 08:41:52 -05:00
Andrei Betlen
3babe3512c Fix mirostat sampling 2024-01-19 08:31:59 -05:00
Andrei Betlen
141293a75b Fix python3.8 support 2024-01-19 08:17:49 -05:00
Andrei Betlen
656f3d8968 Bump version 2024-01-18 21:30:36 -05:00
Andrei Betlen
03ed547bfd Remove templates doc 2024-01-18 21:23:26 -05:00
Andrei Betlen
3ca86ab390 Update llama.cpp 2024-01-18 21:22:45 -05:00
Andrei Betlen
be23404ed4 Cleanup pyproject 2024-01-18 21:22:19 -05:00
Andrei Betlen
89cce50f8c Update llama.cpp 2024-01-18 21:21:49 -05:00
Andrei Betlen
b8fc1c7d83 feat: Add ability to load chat format from huggingface autotokenizer or tokenizer_config.json files. 2024-01-18 21:21:37 -05:00
Andrei Betlen
48c3b77e6f Offload KQV by default 2024-01-18 11:08:57 -05:00