Andrei Betlen
ac55d0a175
fix: Clear the kv cache to avoid a kv-cache bug when an image is evaluated first
2024-05-10 02:38:10 -04:00
Sigbjørn Skjæret
561e880654
fix(security): Render all jinja templates in immutable sandbox ( #1441 )
Chat templates are rendered with ImmutableSandboxedEnvironment in transformers so no need to do otherwise here.
Co-authored-by: Andrei <abetlen@gmail.com>
2024-05-10 00:49:40 -04:00
Patrick Peng
b454f40a9a
Merge pull request from GHSA-56xg-wfcc-g829
Co-authored-by: Andrei <abetlen@gmail.com>
2024-05-10 00:47:56 -04:00
Andrei Betlen
3757328b70
fix: free last image embed in llava chat handler
2024-05-08 22:16:18 -04:00
Andrei Betlen
77122638b4
fix: Make leading bos_token optional for image chat formats, fix nanollava system message
2024-05-08 13:12:31 -04:00
Sarunas Kalade
903b28adf5
fix: adding missing args in create_completion for functionary chat handler ( #1430 )
2024-05-08 02:21:27 -04:00
Jeffrey Fong
1f56c648c3
feat: Implement streaming for Functionary v2 + Bug fixes ( #1419 )
* set up streaming for v2
* assert v2 streaming, fix tool_call vs function_call
* fix streaming with tool_choice/function_call
* make functions return 1 function call only when 'auto'
* fix
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-05-04 10:11:20 -04:00
Andrei Betlen
31b1d95a6c
feat: Add llama-3-vision-alpha chat format
2024-05-02 11:32:18 -04:00
Andrei Betlen
4f01c452b6
fix: Change default value of verbose in image chat format handlers to True to match Llama
2024-04-30 15:50:30 -04:00
Andrei Betlen
3489ef09d3
fix: Ensure image renders before text in chat formats regardless of message content order.
2024-04-30 03:08:46 -04:00
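The reordering fix above can be sketched as a pure function over OpenAI-style multimodal content parts (`image_parts_first` is a hypothetical helper, not the library's actual API):

```python
def image_parts_first(content):
    """Reorder a multimodal message's content parts so image_url parts
    always precede text parts, preserving relative order within each
    group. Hypothetical helper; the real logic lives in the chat
    handlers."""
    if not isinstance(content, list):
        return content  # plain-string content needs no reordering
    images = [p for p in content if p.get("type") == "image_url"]
    texts = [p for p in content if p.get("type") != "image_url"]
    return images + texts

parts = [
    {"type": "text", "text": "What is in this image?"},
    {"type": "image_url", "image_url": {"url": "file:///tmp/cat.png"}},
]
ordered = image_parts_first(parts)  # image part now comes first
```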
Andrei
fe2da09538
feat: Generic Chat Formats, Tool Calling, and Huggingface Pull Support for Multimodal Models (Obsidian, LLaVA1.6, Moondream) ( #1147 )
* Test dummy image tags in chat templates
* Format and improve types for llava_cpp.py
* Add from_pretrained support to llava chat format.
* Refactor llava chat format to use a jinja2
* Revert chat format test
* Add moondream support (wip)
* Update moondream chat format
* Update moondream chat format
* Update moondream prompt
* Add function calling support
* Cache last image embed
* Add Llava1.6 support
* Add nanollava support
* Add Obsidian support
* Remove unnecessary import
* Re-order multimodal chat formats
* Logits all no longer required for multi-modal models
* Update README.md
* Update docs
* Update README
* Fix typo
* Update README
* Fix typo
2024-04-30 01:35:38 -04:00
Jeffrey Fong
f178636e1b
fix: Functionary bug fixes ( #1385 )
* fix completion tokens tracking, prompt forming
* fix 'function_call' and 'tool_calls' depending on 'functions' and 'tools', incompatibility with python 3.8
* Updated README
* fix for openai server compatibility
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-04-27 20:49:52 -04:00
abk16
8559e8ce88
feat: Add Llama-3 chat format ( #1371 )
* feat: Add Llama-3 chat format
* feat: Auto-detect Llama-3 chat format from gguf template
* feat: Update llama.cpp to b2715
Includes proper Llama-3 <|eot_id|> token handling.
---------
Co-authored-by: Andrei Betlen <abetlen@gmail.com>
2024-04-23 02:33:29 -04:00
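The Llama-3 instruct layout referenced above (header tokens plus a closing `<|eot_id|>` per turn) can be sketched as follows; the formatter is hypothetical, the token layout is the published Llama-3 prompt format:

```python
def format_llama3(messages, add_generation_prompt=True):
    # Each turn gets header tokens and a terminating <|eot_id|>; a
    # trailing assistant header cues the model to reply.
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    if add_generation_prompt:
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt
```

Note that `<|eot_id|>` must also act as a stop token, which is what the stopping_criteria commit in this log addresses.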
Andrei Betlen
cc81afebf0
feat: Add stopping_criteria to ChatFormatter, allow stopping on arbitrary token ids, fixes llama3 instruct
2024-04-20 00:00:53 -04:00
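Stopping on arbitrary token ids, as this commit adds, amounts to a criterion like the sketch below; the `(tokens, logits) -> bool` callable shape and the example `<|eot_id|>` id are assumptions to illustrate the idea, not the library's exact signature:

```python
def stop_on_tokens(stop_token_ids):
    """Stop generation once the last sampled token is in
    stop_token_ids. Shape is an assumption mirroring the idea of a
    stopping criterion, not the real API."""
    stop_set = set(stop_token_ids)

    def criterion(tokens, logits):
        return len(tokens) > 0 and tokens[-1] in stop_set

    return criterion

# e.g. stopping on Llama-3's <|eot_id|> (id 128009 in its tokenizer)
eot = stop_on_tokens([128009])
```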
Lucca Zenóbio
4f42664955
feat: update grammar schema converter to match llama.cpp ( #1353 )
* feat: improve function calling
* feat: grammar
* fix
* fix
* fix
2024-04-18 01:36:25 -04:00
Andrei Betlen
fa4bb0cf81
Revert "feat: Update json to grammar ( #1350 )"
This reverts commit 610a592f70.
2024-04-17 16:18:16 -04:00
Lucca Zenóbio
610a592f70
feat: Update json to grammar ( #1350 )
* feat: improve function calling
* feat: grammar
2024-04-17 10:10:21 -04:00
Andrei Betlen
bb65b4d764
fix: pass correct type to chat handlers for chat completion logprobs
2024-04-10 03:41:55 -04:00
Andrei Betlen
1ae3abbcc3
fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes #1328 Closes #1314
2024-04-05 10:51:44 -04:00
windspirit95
aa9f1ae011
feat: Add logprobs support to chat completions ( #1311 )
* Add logprobs return in ChatCompletionResponse
* Fix duplicate field
* Set default to false
* Simplify check
* Add server example
---------
Co-authored-by: Andrei Betlen <abetlen@gmail.com>
2024-03-31 13:30:13 -04:00
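For context on what the logprobs feature reports: a sampled token's logprob is its logit minus the log-sum-exp over the full logit vector at that step. A toy stdlib-only version:

```python
import math

def token_logprobs(logits, chosen_ids):
    """Toy log-softmax: the logprob reported for a sampled token is its
    logit minus log-sum-exp over that step's logit vector."""
    out = []
    for step, tok in zip(logits, chosen_ids):
        m = max(step)  # subtract the max for numerical stability
        lse = m + math.log(sum(math.exp(x - m) for x in step))
        out.append(step[tok] - lse)
    return out

lps = token_logprobs([[2.0, 1.0, 0.0]], [0])  # one step, token 0 sampled
# lps[0] is negative: log of a probability below 1
```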
Andrei Betlen
c1325dcdfb
fix: tool_call missing first token.
2024-03-22 23:44:04 -04:00
Andrei
60d8498f21
feat: Add tools/functions variables to Jinja2ChatFormatter, add function response formatting for all simple chat formats ( #1273 )
* Add tools/functions variables to Jinja2ChatFormatter
Also fixed missing tools/tool_choices parameters in chat_formatter_to_chat_completion_handler().
* Set grammar when doing explicit function calling
* Add function / tool response for all chat formats
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2024-03-19 04:55:57 -04:00
Jeffrey Fong
8a60c7bc8c
fix: Fix and optimize functionary chat handler ( #1282 )
* fix functionary chat logic
* further fixes
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-03-18 10:40:57 -04:00
Andrei Betlen
20e6815252
fix: json mode
2024-03-15 12:58:34 -04:00
Kevin Cao
1f3156d4f2
fix: Check for existence of clip model path ( #1264 )
2024-03-08 21:00:10 -05:00
Andrei Betlen
dbaba3059d
fix: positional arguments only for low-level api
2024-02-26 11:31:11 -05:00
Andrei Betlen
78e536dcfe
fix: typo
2024-02-26 11:14:26 -05:00
Andrei Betlen
8383a9e562
fix: llava "this function takes at least 4 arguments (0 given)" error
2024-02-26 11:03:20 -05:00
Luke Stanley
858496224e
feat: Auto detect Mixtral's slightly different format ( #1214 )
2024-02-23 11:27:38 -05:00
Alvaro Bartolome
251a8a2cad
feat: Add Google's Gemma formatting via chat_format="gemma" ( #1210 )
* Add Google's Gemma formatting via `chat_format="gemma"`
* Replace `raise ValueError` with `logger.debug`
Co-authored-by: Andrei <abetlen@gmail.com>
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-23 04:40:52 -05:00
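For reference, Gemma's turn layout wraps each turn in `<start_of_turn>`/`<end_of_turn>` and names the assistant role `model`. A minimal sketch, folding every non-assistant role into a user turn (a simplification; as the PR notes, Gemma has no system role):

```python
def format_gemma(messages):
    # Gemma turn delimiters; assistant is mapped to "model", all other
    # roles are folded into "user" turns in this sketch.
    prompt = ""
    for m in messages:
        role = "model" if m["role"] == "assistant" else "user"
        prompt += f"<start_of_turn>{role}\n{m['content']}<end_of_turn>\n"
    return prompt + "<start_of_turn>model\n"
```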
Andrei Betlen
07a783779a
fix: Update openbuddy prompt format. Closes #1155
2024-02-13 23:57:10 -05:00
Andrei Betlen
345215a76c
fix: more chatml-function-calling fixes
2024-02-13 23:02:50 -05:00
Andrei Betlen
68fb71b6a2
fix: missing generation_prompt in chatml-function-calling
2024-02-13 03:24:41 -05:00
Andrei Betlen
4b0e3320bd
fix: minor formatting bugs for chatml-function-calling
2024-02-13 03:11:35 -05:00
Andrei
153a0049d9
feat: Generic chatml Function Calling ( #957 )
* Add demo notebook
* Add initial chat handler
* Update OpenAI types
* Add generic chatml function calling (wip)
* Update chatml generic function calling.
* Progress on auto-tool calls
* fix streaming functions
* Remove print statements
* fix: Suppress output from llama.cpp init and grammar creation
* Add OpenAI v1 python api compatible chat completion function
* Support non-streaming multi-tool calls
* Format
* Include function_call in response.
2024-02-12 15:56:07 -05:00
Jeffrey Fong
901827013b
feat: Integrate functionary v1.4 and v2 models + add custom tokenizer support to Llama class ( #1078 )
* convert functionary-v1 chat handler to use hf autotokenizer
* add hf_tokenizer + integrate functionary-v1.4 prompt template
* integrate functionary v2 prompt template
* update readme
* set up parallel function calling wip
* set up parallel function calling
* Update README.md
* Update README.md
* refactor tokenizers
* include old functionary handler for backward compatibility
* add hf_tokenizer_path in server ModelSettings
* convert functionary-v1 chat handler to use hf autotokenizer
* add hf_tokenizer + integrate functionary-v1.4 prompt template
* integrate functionary v2 prompt template
* update readme
* set up parallel function calling wip
* resolve merge conflict
* Update README.md
* Update README.md
* refactor tokenizers
* include old functionary handler for backward compatibility
* add hf_tokenizer_path in server ModelSettings
* Cleanup PR, fix breaking changes
* Use hf_pretrained_model_name_or_path for tokenizer
* fix hf tokenizer in streaming
* update README
* refactor offset mapping
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-07 20:07:03 -05:00
Andrei Betlen
078cca0361
fix: Pass raise_exception and add_generation_prompt to jinja2 chat template
2024-01-31 08:42:21 -05:00
Andrei
da003d8768
Automatically set chat format from gguf ( #1110 )
* Use jinja formatter to load chat format from gguf
* Fix off-by-one error in metadata loader
* Implement chat format auto-detection
2024-01-29 14:22:23 -05:00
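Auto-detection can be pictured as a lookup from the `tokenizer.chat_template` string stored in gguf metadata to a known format name; the metadata key is the real gguf convention, while the table entries and helper below are illustrative:

```python
# Illustrative template-to-format table; real detection compares against
# the templates the library ships with.
KNOWN_TEMPLATES = {
    "{% for m in messages %}[INST] {{ m['content'] }} [/INST]{% endfor %}":
        "mistral-instruct",
}

def guess_chat_format(metadata, default="llama-2"):
    """Map gguf metadata to a chat format name, falling back to a
    default when the template is absent or unrecognized."""
    template = metadata.get("tokenizer.chat_template")
    return KNOWN_TEMPLATES.get(template, default)

meta = {
    "tokenizer.chat_template":
        "{% for m in messages %}[INST] {{ m['content'] }} [/INST]{% endfor %}"
}
fmt = guess_chat_format(meta)  # "mistral-instruct"
```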
Andrei Betlen
9ae5819ee4
Add chat format test.
2024-01-29 00:59:01 -05:00
Rafaelblsilva
ce38dbdf07
Add mistral instruct chat format as "mistral-instruct" ( #799 )
* Added mistral instruct chat format as "mistral"
* Fix stop sequence (merge issue)
* Update chat format name to `mistral-instruct`
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-01-29 00:34:42 -05:00
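The mistral-instruct layout added above wraps user turns in `[INST] ... [/INST]` with assistant replies appended in between. A minimal sketch (hypothetical helper; BOS/EOS handling is left to the tokenizer here):

```python
def format_mistral_instruct(messages):
    # User turns wrapped in [INST] ... [/INST]; assistant turns appended
    # verbatim. System-message handling is omitted in this sketch.
    prompt = ""
    for m in messages:
        if m["role"] == "user":
            prompt += f"[INST] {m['content']} [/INST]"
        elif m["role"] == "assistant":
            prompt += m["content"]
    return prompt
```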
Andrei
d8f6914f45
Add json schema mode ( #1122 )
* Add json schema mode
* Add llava chat format support
2024-01-27 16:52:18 -05:00
Andrei Betlen
5b982d0f8c
fix: use both eos and bos tokens as stop sequences for hf-tokenizer-config chat format.
2024-01-22 08:32:48 -05:00
Andrei Betlen
7f3209b1eb
feat: Add add_generation_prompt option for jinja2chatformatter.
2024-01-21 18:37:24 -05:00
Andrei Betlen
be09318c26
feat: Add Jinja2ChatFormatter
2024-01-19 15:04:42 -05:00
Andrei Betlen
b8fc1c7d83
feat: Add ability to load chat format from huggingface autotokenizer or tokenizer_config.json files.
2024-01-18 21:21:37 -05:00
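Loading a format from `tokenizer_config.json` builds on the file's standard Hugging Face fields (`chat_template`, `bos_token`, `eos_token`). The snippet below only sketches the lookup, including the eos-as-stop-sequence behavior the hf-tokenizer-config format uses; the config body is a made-up example, the field names are the real convention:

```python
import json

# A pared-down tokenizer_config.json body.
raw = """
{
  "chat_template": "{% for m in messages %}{{ m['role'] }}: {{ m['content'] }}\\n{% endfor %}",
  "bos_token": "<s>",
  "eos_token": "</s>"
}
"""
config = json.loads(raw)
template = config.get("chat_template")  # jinja2 source for the chat format
stops = [config["eos_token"]]           # eos token doubles as a stop sequence
```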
Fedor Moiseev
907b9e9d42
Add Saiga chat format. ( #1050 )
2024-01-04 18:12:58 -05:00
xaviviro
cf743ec5d3
Added ChatGLM chat format ( #1059 )
Co-authored-by: Xavier Vinaixa Rosello <xaviviro@MacBook-Pro-de-Xavier.local>
2024-01-04 18:12:02 -05:00
yhfgyyf
8b4db732bd
Add qwen chat format ( #1005 )
2023-12-13 21:43:43 -05:00
chiensen
b938cccf05
Add Pygmalion chat format ( #986 )
2023-12-11 20:44:04 -05:00
Gardner Bickford
c2d63a7148
fix: Typo in the Open Orca chat format #874 ( #947 )
2023-11-26 15:39:18 -05:00