Commit graph

76 commits

Author SHA1 Message Date
Sigbjørn Skjæret
027f7bc678
fix: Avoid duplicate special tokens in chat formats (#1439)
* Templates sometimes have BOS in them, remove duplicate

* tokenize chat format prompts before completion

This is to ensure that we don't duplicate any special tokens.

Hopefully I amended the existing formats correctly?

* updated comment

* corrected a few

* add some missing internals

* proper bos/eos detection

* just let tokenizer do the job

* typo--

* align test with new response

* changed to a warning

* move to another PR

* Use python warnings module

---------

Co-authored-by: Andrei Betlen <abetlen@gmail.com>
2024-06-04 10:15:41 -04:00
Andrei Betlen
6b018e00b1 misc: Improve llava error messages 2024-06-03 11:19:10 -04:00
Andrei Betlen
165b4dc6c1 fix: Fix typo in Llama3VisionAlphaChatHandler. Closes #1488 2024-05-29 02:29:44 -04:00
Andrei Betlen
ac55d0a175 fix: Clear kv cache to avoid kv bug when image is evaluated first 2024-05-10 02:38:10 -04:00
Sigbjørn Skjæret
561e880654
fix(security): Render all jinja templates in immutable sandbox (#1441)
Chat templates are rendered with ImmutableSandboxedEnvironment in transformers so no need to do otherwise here.

Co-authored-by: Andrei <abetlen@gmail.com>
2024-05-10 00:49:40 -04:00
Patrick Peng
b454f40a9a
Merge pull request from GHSA-56xg-wfcc-g829
Co-authored-by: Andrei <abetlen@gmail.com>
2024-05-10 00:47:56 -04:00
Andrei Betlen
3757328b70 fix: free last image embed in llava chat handler 2024-05-08 22:16:18 -04:00
Andrei Betlen
77122638b4 fix: Make leading bos_token optional for image chat formats, fix nanollava system message 2024-05-08 13:12:31 -04:00
Sarunas Kalade
903b28adf5
fix: adding missing args in create_completion for functionary chat handler (#1430) 2024-05-08 02:21:27 -04:00
Jeffrey Fong
1f56c648c3
feat: Implement streaming for Functionary v2 + Bug fixes (#1419)
* set up streaming for v2

* assert v2 streaming, fix tool_call vs function_call

* fix streaming with tool_choice/function_call

* make functions return 1 function call only when 'auto'

* fix

---------

Co-authored-by: Andrei <abetlen@gmail.com>
2024-05-04 10:11:20 -04:00
Andrei Betlen
31b1d95a6c feat: Add llama-3-vision-alpha chat format 2024-05-02 11:32:18 -04:00
Andrei Betlen
4f01c452b6 fix: Change default verbose value of verbose in image chat format handlers to True to match Llama 2024-04-30 15:50:30 -04:00
Andrei Betlen
3489ef09d3 fix: Ensure image renders before text in chat formats regardless of message content order. 2024-04-30 03:08:46 -04:00
Andrei
fe2da09538
feat: Generic Chat Formats, Tool Calling, and Huggingface Pull Support for Multimodal Models (Obsidian, LLaVA1.6, Moondream) (#1147)
* Test dummy image tags in chat templates

* Format and improve  types for llava_cpp.py

* Add from_pretrained support to llava chat format.

* Refactor llava chat format to use a jinja2

* Revert chat format test

* Add moondream support (wip)

* Update moondream chat format

* Update moondream chat format

* Update moondream prompt

* Add function calling support

* Cache last image embed

* Add Llava1.6 support

* Add nanollava support

* Add obisidian support

* Remove unnecessary import

* Re-order multimodal chat formats

* Logits all no longer required for multi-modal models

* Update README.md

* Update docs

* Update README

* Fix typo

* Update README

* Fix typo
2024-04-30 01:35:38 -04:00
Jeffrey Fong
f178636e1b
fix: Functionary bug fixes (#1385)
* fix completion tokens tracking, prompt forming

* fix 'function_call' and 'tool_calls' depending on 'functions' and 'tools', incompatibility with python 3.8

* Updated README

* fix for openai server compatibility

---------

Co-authored-by: Andrei <abetlen@gmail.com>
2024-04-27 20:49:52 -04:00
abk16
8559e8ce88
feat: Add Llama-3 chat format (#1371)
* feat: Add Llama-3 chat format

* feat: Auto-detect Llama-3 chat format from gguf template

* feat: Update llama.cpp to b2715

Includes proper Llama-3 <|eot_id|> token handling.

---------

Co-authored-by: Andrei Betlen <abetlen@gmail.com>
2024-04-23 02:33:29 -04:00
Andrei Betlen
cc81afebf0 feat: Add stopping_criteria to ChatFormatter, allow stopping on arbitrary token ids, fixes llama3 instruct 2024-04-20 00:00:53 -04:00
Lucca Zenóbio
4f42664955
feat: update grammar schema converter to match llama.cpp (#1353)
* feat: improve function calling

* feat:grammar

* fix

* fix

* fix
2024-04-18 01:36:25 -04:00
Andrei Betlen
fa4bb0cf81 Revert "feat: Update json to grammar (#1350)"
This reverts commit 610a592f70.
2024-04-17 16:18:16 -04:00
Lucca Zenóbio
610a592f70
feat: Update json to grammar (#1350)
* feat: improve function calling

* feat:grammar
2024-04-17 10:10:21 -04:00
Andrei Betlen
bb65b4d764 fix: pass correct type to chat handlers for chat completion logprobs 2024-04-10 03:41:55 -04:00
Andrei Betlen
1ae3abbcc3 fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes #1328 Closes #1314 2024-04-05 10:51:44 -04:00
windspirit95
aa9f1ae011
feat: Add logprobs support to chat completions (#1311)
* Add logprobs return in ChatCompletionResponse

* Fix duplicate field

* Set default to false

* Simplify check

* Add server example

---------

Co-authored-by: Andrei Betlen <abetlen@gmail.com>
2024-03-31 13:30:13 -04:00
Andrei Betlen
c1325dcdfb fix: tool_call missing first token. 2024-03-22 23:44:04 -04:00
Andrei
60d8498f21
feat: Add tools/functions variables to Jinja2ChatFormatter, add function response formatting for all simple chat formats (#1273)
* Add tools/functions variables to Jinja2ChatFormatter

Also fixed missing tools/tool_choices parameters in chat_formatter_to_chat_completion_handler().

* Set grammar when doing explicit function calling

* Add function / tool response for all chat formats

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2024-03-19 04:55:57 -04:00
Jeffrey Fong
8a60c7bc8c
fix: Fix and optimize functionary chat handler (#1282)
* fix functionary chat logic

* further fixes

---------

Co-authored-by: Andrei <abetlen@gmail.com>
2024-03-18 10:40:57 -04:00
Andrei Betlen
20e6815252 fix: json mode 2024-03-15 12:58:34 -04:00
Kevin Cao
1f3156d4f2
fix: Check for existence of clip model path (#1264) 2024-03-08 21:00:10 -05:00
Andrei Betlen
dbaba3059d fix: positional arguments only for low-level api 2024-02-26 11:31:11 -05:00
Andrei Betlen
78e536dcfe fix: typo 2024-02-26 11:14:26 -05:00
Andrei Betlen
8383a9e562 fix: llava this function takes at least 4 arguments (0 given) 2024-02-26 11:03:20 -05:00
Luke Stanley
858496224e
feat: Auto detect Mixtral's slightly different format (#1214) 2024-02-23 11:27:38 -05:00
Alvaro Bartolome
251a8a2cad
feat: Add Google's Gemma formatting via chat_format="gemma" (#1210)
* Add Google's Gemma formatting via `chat_format="gemma"`

* Replace `raise ValueError` with `logger.debug`

Co-authored-by: Andrei <abetlen@gmail.com>

---------

Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-23 04:40:52 -05:00
Andrei Betlen
07a783779a fix: Update openbuddy prompt format. Closes #1155 2024-02-13 23:57:10 -05:00
Andrei Betlen
345215a76c fix: more chatml-function-calling fixes 2024-02-13 23:02:50 -05:00
Andrei Betlen
68fb71b6a2 fix: missing generation_prompt in chatml-function-calling 2024-02-13 03:24:41 -05:00
Andrei Betlen
4b0e3320bd fix: minor formatting bugs for chatml-function-calling 2024-02-13 03:11:35 -05:00
Andrei
153a0049d9
feat: Generic chatml Function Calling (#957)
* Add demo notebook

* Add initial chat handler

* Update OpenAI types

* Add generic chatml function calling (wip)

* Update chatml generic function calling.

* Progress on auto-tool calls

* fix streaming functions

* Remove print statements

* fix: Suppress output from llama.cpp init and grammar creation

* Add OpenAI v1 python api compatible chat completion function

* Support non-streaming multi-tool calls

* Format

* Include function_call in response.
2024-02-12 15:56:07 -05:00
Jeffrey Fong
901827013b
feat: Integrate functionary v1.4 and v2 models + add custom tokenizer support to Llama class (#1078)
* convert functionary-v1 chat handler to use hf autotokenizer

* add hf_tokenizer + inteegrate functionary-v1.4 prompt template

* integrate functionary v2 prompt template

* update readme

* set up parallel function calling wip

* set up parallel function calling

* Update README.md

* Update README.md

* refactor tokenizers

* include old functionary handler for backward compatibility

* add hf_tokenizer_path in server ModelSettings

* convert functionary-v1 chat handler to use hf autotokenizer

* add hf_tokenizer + inteegrate functionary-v1.4 prompt template

* integrate functionary v2 prompt template

* update readme

* set up parallel function calling wip

* resolve merge conflict

* Update README.md

* Update README.md

* refactor tokenizers

* include old functionary handler for backward compatibility

* add hf_tokenizer_path in server ModelSettings

* Cleanup PR, fix breaking changes

* Use hf_pretrained_model_name_or_path for tokenizer

* fix hf tokenizer in streaming

* update README

* refactor offset mapping

---------

Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-07 20:07:03 -05:00
Andrei Betlen
078cca0361 fix: Pass raise_exception and add_generation_prompt to jinja2 chat template 2024-01-31 08:42:21 -05:00
Andrei
da003d8768
Automatically set chat format from gguf (#1110)
* Use jinja formatter to load chat format from gguf

* Fix off-by-one error in metadata loader

* Implement chat format auto-detection
2024-01-29 14:22:23 -05:00
Andrei Betlen
9ae5819ee4 Add chat format test. 2024-01-29 00:59:01 -05:00
Rafaelblsilva
ce38dbdf07
Add mistral instruct chat format as "mistral-instruct" (#799)
* Added mistral instruct chat format as "mistral"

* Fix stop sequence (merge issue)

* Update chat format name to `mistral-instruct`

---------

Co-authored-by: Andrei <abetlen@gmail.com>
2024-01-29 00:34:42 -05:00
Andrei
d8f6914f45
Add json schema mode (#1122)
* Add json schema mode

* Add llava chat format support
2024-01-27 16:52:18 -05:00
Andrei Betlen
5b982d0f8c fix: use both eos and bos tokens as stop sequences for hf-tokenizer-config chat format. 2024-01-22 08:32:48 -05:00
Andrei Betlen
7f3209b1eb feat: Add add_generation_prompt option for jinja2chatformatter. 2024-01-21 18:37:24 -05:00
Andrei Betlen
be09318c26 feat: Add Jinja2ChatFormatter 2024-01-19 15:04:42 -05:00
Andrei Betlen
b8fc1c7d83 feat: Add ability to load chat format from huggingface autotokenizer or tokenizer_config.json files. 2024-01-18 21:21:37 -05:00
Fedor Moiseev
907b9e9d42
Add Saiga chat format. (#1050) 2024-01-04 18:12:58 -05:00
xaviviro
cf743ec5d3
Added ChatGLM chat format (#1059)
Co-authored-by: Xavier Vinaixa Rosello <xaviviro@MacBook-Pro-de-Xavier.local>
2024-01-04 18:12:02 -05:00