* Add draft model param to llama class, implement basic prompt lookup decoding draft model
* Use samplingcontext for sampling
* Use 1d array
* Use draft model for sampling
* Fix dumb mistake
* Allow for later extensions to the LlamaDraftModel api
* Cleanup
* Adaptive candidate prediction
* Update implementation to match hf transformers
* Tuning
* Fix bug where last token was not used for ngram prediction
* Remove heuristic for num_pred_tokens (no benefit)
* fix: n_candidates bug.
* Add draft_model_num_pred_tokens server setting
* Cleanup
* Update README
* Added mistral instruct chat format as "mistral"
* Fix stop sequence (merge issue)
* Update chat format name to `mistral-instruct`
---------
Co-authored-by: Andrei <abetlen@gmail.com>
* feat: Add support for jinja templating
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
* fix: Refactor chat formatter and update interface for jinja templates
- Simplify the `llama2_template` in `llama_jinja_format.py` by removing unnecessary line breaks for readability without affecting functionality.
- Update `ChatFormatterInterface` constructor to accept a more generic `Optional[object]` type for the template parameter, enhancing flexibility.
- Introduce a `template` property to `ChatFormatterInterface` for standardized access to the template string.
- Replace `MetaSingleton` metaclass with `Singleton` for the `ChatFormatterFactory` to streamline the singleton implementation.
These changes enhance code readability, maintain usability, and ensure consistency in the chat formatter's design pattern usage.
* Add outline for Jinja2 templating integration documentation
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
* Add jinja2 as a dependency with version range for Hugging Face transformers compatibility
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
* Update jinja2 version constraint for mkdocs-material compatibility
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
* Fix attribute name in AutoChatFormatter
- Changed attribute name from `self._renderer` to `self._environment`
---------
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>