Andrei
d8f6914f45
Add json schema mode (#1122)
* Add json schema mode
* Add llava chat format support
2024-01-27 16:52:18 -05:00
Andrei Betlen
c6d3bd62e8
Update llama.cpp
2024-01-27 16:22:46 -05:00
Andrei Betlen
35918873b4
Update llama.cpp
2024-01-26 11:45:48 -05:00
Andrei Betlen
f5cc6b3053
Bump version
2024-01-25 11:28:16 -05:00
Andrei Betlen
cde7514c3d
feat(server): include llama-cpp-python version in openapi spec
2024-01-25 11:23:18 -05:00
Andrei Betlen
2588f34a22
Update llama.cpp
2024-01-25 11:22:42 -05:00
Andrei Betlen
dc5a436224
Update llama.cpp
2024-01-25 11:19:34 -05:00
Andrei Betlen
d6fb16e055
docs: Update README
2024-01-25 10:51:48 -05:00
Andrei Betlen
5b258bf840
docs: Update README with more param common examples
2024-01-24 10:51:15 -05:00
Andrei Betlen
c343baaba8
Update llama.cpp
2024-01-24 10:40:50 -05:00
Andrei Betlen
c970d41a85
fix: llama_log_set should be able to accept null pointer
2024-01-24 10:38:30 -05:00
Andrei Betlen
9677a1f2c8
fix: Check order
2024-01-23 22:28:03 -05:00
Andrei Betlen
4d6b2f7b91
fix: format
2024-01-23 22:08:27 -05:00
Phil H
fe5d6ea648
fix: GGUF metadata KV overrides, re #1011 (#1116)
* kv overrides another attempt
* add sentinel element, simplify array population
* ensure sentinel element is zeroed
2024-01-23 22:00:38 -05:00
Andrei Betlen
7e63928bc9
Update llama.cpp
2024-01-23 18:42:39 -05:00
Andrei Betlen
fcdf337d84
Update llama.cpp
2024-01-22 11:25:11 -05:00
Andrei Betlen
5b982d0f8c
fix: use both eos and bos tokens as stop sequences for hf-tokenizer-config chat format.
2024-01-22 08:32:48 -05:00
Andrei Betlen
2ce0b8aa2c
Bump version
2024-01-21 20:30:24 -05:00
Andrei Betlen
d3f5528ca8
fix: from_json_schema oneof/anyof bug. Closes #1097
2024-01-21 19:06:53 -05:00
Andrei Betlen
8eefdbca03
Update llama.cpp
2024-01-21 19:01:27 -05:00
Andrei Betlen
88fbccaaa3
docs: Add macosx wrong arch fix to README
2024-01-21 18:38:44 -05:00
Andrei Betlen
24f39454e9
fix: pass chat handler not chat formatter for huggingface autotokenizer and tokenizer_config formats.
2024-01-21 18:38:04 -05:00
Andrei Betlen
7f3209b1eb
feat: Add add_generation_prompt option for jinja2chatformatter.
2024-01-21 18:37:24 -05:00
Andrei Betlen
ac2e96d4b4
Update llama.cpp
2024-01-19 15:33:43 -05:00
Andrei Betlen
be09318c26
feat: Add Jinja2ChatFormatter
2024-01-19 15:04:42 -05:00
Andrei Betlen
5a34c57e54
feat: Expose gguf model metadata in metadata property
2024-01-19 10:46:03 -05:00
Andrei Betlen
833a7f1a86
Bump version
2024-01-19 09:03:35 -05:00
Andrei Betlen
e21c3c7a91
Update makefile
2024-01-19 08:47:56 -05:00
Andrei Betlen
0f54948482
Update llama.cpp
2024-01-19 08:41:52 -05:00
Andrei Betlen
3babe3512c
Fix mirostat sampling
2024-01-19 08:31:59 -05:00
Andrei Betlen
141293a75b
Fix python3.8 support
2024-01-19 08:17:49 -05:00
Andrei Betlen
656f3d8968
Bump version
2024-01-18 21:30:36 -05:00
Andrei Betlen
03ed547bfd
Remove templates doc
2024-01-18 21:23:26 -05:00
Andrei Betlen
3ca86ab390
Update llama.cpp
2024-01-18 21:22:45 -05:00
Andrei Betlen
be23404ed4
Cleanup pyproject
2024-01-18 21:22:19 -05:00
Andrei Betlen
89cce50f8c
Update llama.cpp
2024-01-18 21:21:49 -05:00
Andrei Betlen
b8fc1c7d83
feat: Add ability to load chat format from huggingface autotokenizer or tokenizer_config.json files.
2024-01-18 21:21:37 -05:00
Andrei Betlen
48c3b77e6f
Offload KQV by default
2024-01-18 11:08:57 -05:00
Austin
6bfe98bd80
Integration of Jinja2 Templating (#875)
* feat: Add support for jinja templating
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
* fix: Refactor chat formatter and update interface for jinja templates
- Simplify the `llama2_template` in `llama_jinja_format.py` by removing unnecessary line breaks for readability without affecting functionality.
- Update `ChatFormatterInterface` constructor to accept a more generic `Optional[object]` type for the template parameter, enhancing flexibility.
- Introduce a `template` property to `ChatFormatterInterface` for standardized access to the template string.
- Replace `MetaSingleton` metaclass with `Singleton` for the `ChatFormatterFactory` to streamline the singleton implementation.
These changes enhance code readability, maintain usability, and ensure consistency in the chat formatter's design pattern usage.
* Add outline for Jinja2 templating integration documentation
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
* Add jinja2 as a dependency with version range for Hugging Face transformers compatibility
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
* Update jinja2 version constraint for mkdocs-material compatibility
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
* Fix attribute name in AutoChatFormatter
- Changed attribute name from `self._renderer` to `self._environment`
---------
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
2024-01-17 09:47:52 -05:00
Andrei Betlen
52adc23115
Update llama.cpp
2024-01-17 09:27:40 -05:00
Andrei Betlen
7b46bb5a78
Re-order classes in llama.py
2024-01-17 09:16:13 -05:00
Andrei Betlen
cc4630e66f
Move helper classes to _internals submodule
2024-01-17 09:14:00 -05:00
Andrei Betlen
3b92419132
Move cache classes to llama_cache submodule.
2024-01-17 09:09:12 -05:00
Andrei Betlen
6981597835
Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main
2024-01-16 19:35:59 -05:00
Andrei Betlen
d5dbb3f8de
Update llama.cpp
2024-01-16 19:35:57 -05:00
Jerry Liu
84380fe9a6
Add llamaindex integration to readme (#1092)
2024-01-16 19:10:50 -05:00
Kyle Mistele
9c36688b33
fix(cli): allow passing n_ctx=0 to openAI API server args to use model n_ctx_train field per #1015 (#1093)
2024-01-16 18:54:06 -05:00
anil
cfb7da98ed
Support Accept text/event-stream in chat and completion endpoints, resolves #1083 (#1088)
Co-authored-by: Anil Pathak <anil@heyday.com>
Co-authored-by: Andrei Betlen <abetlen@gmail.com>
2024-01-16 12:52:52 -05:00
Andrei Betlen
e39778f8eb
Update llama.cpp
2024-01-16 11:56:44 -05:00
Andrei Betlen
4b11fa83c0
Bump version
2024-01-15 12:54:51 -05:00