Andrei Betlen
0a454bebe6
feat(server): Remove temperature bounds checks for server. Closes #1384
2024-05-03 15:23:06 -04:00
Daniel Thuerck
2138561fab
fix(server): Propagate flash_attn to model load. ( #1424 )
2024-05-03 12:17:07 -04:00
Andrei Betlen
2117122396
chore: Bump version
2024-05-02 12:07:09 -04:00
Andrei Betlen
d75dea18db
feat: Update llama.cpp
2024-05-02 12:00:44 -04:00
Andrei Betlen
31b1d95a6c
feat: Add llama-3-vision-alpha chat format
2024-05-02 11:32:18 -04:00
Andrei Betlen
4f01c452b6
fix: Change default value of verbose in image chat format handlers to True to match Llama
2024-04-30 15:50:30 -04:00
Andrei Betlen
946156fb6c
feat: Update llama.cpp
2024-04-30 15:46:45 -04:00
Andrei Betlen
9286b5caac
Merge branch 'main' of github.com:abetlen/llama_cpp_python into main
2024-04-30 15:45:36 -04:00
Andrei Betlen
f116175a5a
fix: Suppress all logs when verbose=False, use hardcoded fileno's to work in colab notebooks. Closes #796 Closes #729
2024-04-30 15:45:34 -04:00
Jonathan Soma
3226b3c5ef
fix: UTF-8 handling with grammars ( #1415 )
...
Use Python's built-in UTF-8 handling to get code points
2024-04-30 14:33:23 -04:00
Andrei Betlen
945c62c567
docs: Change all examples from interpreter style to script style.
2024-04-30 10:15:04 -04:00
Andrei Betlen
26478ab293
docs: Update README.md
2024-04-30 10:11:38 -04:00
Andrei Betlen
b14dd98922
chore: Bump version
2024-04-30 09:39:56 -04:00
Andrei Betlen
29b6e9a5c8
fix: wrong parameter for flash attention in pickle __getstate__
2024-04-30 09:32:47 -04:00
Andrei Betlen
22d77eefd2
feat: Add option to enable flash_attn to Llama params and ModelSettings
2024-04-30 09:29:16 -04:00
Andrei Betlen
8c2b24d5aa
feat: Update llama.cpp
2024-04-30 09:27:55 -04:00
Olivier DEBAUCHE
6332527a69
fix(ci): Fix build-and-release.yaml ( #1413 )
...
* Update build-and-release.yaml
* Update build-and-release.yaml
2024-04-30 09:16:14 -04:00
Andrei Betlen
c8cd8c17c6
docs: Update README to include CUDA 12.4 wheels
2024-04-30 03:12:46 -04:00
Andrei Betlen
f417cce28a
chore: Bump version
2024-04-30 03:11:02 -04:00
Andrei Betlen
3489ef09d3
fix: Ensure image renders before text in chat formats regardless of message content order.
2024-04-30 03:08:46 -04:00
Andrei Betlen
d03f15bb73
fix(ci): Fix bug in use of upload-artifact failing to merge multiple artifacts into a single release.
2024-04-30 02:58:55 -04:00
Andrei Betlen
26c7876ba0
chore: Bump version
2024-04-30 01:48:40 -04:00
Andrei
fe2da09538
feat: Generic Chat Formats, Tool Calling, and Huggingface Pull Support for Multimodal Models (Obsidian, LLaVA1.6, Moondream) ( #1147 )
...
* Test dummy image tags in chat templates
* Format and improve types for llava_cpp.py
* Add from_pretrained support to llava chat format.
* Refactor llava chat format to use a jinja2
* Revert chat format test
* Add moondream support (wip)
* Update moondream chat format
* Update moondream chat format
* Update moondream prompt
* Add function calling support
* Cache last image embed
* Add Llava1.6 support
* Add nanollava support
* Add obsidian support
* Remove unnecessary import
* Re-order multimodal chat formats
* Logits all no longer required for multi-modal models
* Update README.md
* Update docs
* Update README
* Fix typo
* Update README
* Fix typo
2024-04-30 01:35:38 -04:00
Andrei Betlen
97fb860eba
feat: Update llama.cpp
2024-04-29 23:34:55 -04:00
dependabot[bot]
df2b5b5d44
chore(deps): bump actions/upload-artifact from 3 to 4 ( #1412 )
...
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact ) from 3 to 4.
- [Release notes](https://github.com/actions/upload-artifact/releases )
- [Commits](https://github.com/actions/upload-artifact/compare/v3...v4 )
---
updated-dependencies:
- dependency-name: actions/upload-artifact
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-29 22:53:42 -04:00
dependabot[bot]
be43018e09
chore(deps): bump actions/configure-pages from 4 to 5 ( #1411 )
...
Bumps [actions/configure-pages](https://github.com/actions/configure-pages ) from 4 to 5.
- [Release notes](https://github.com/actions/configure-pages/releases )
- [Commits](https://github.com/actions/configure-pages/compare/v4...v5 )
---
updated-dependencies:
- dependency-name: actions/configure-pages
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-29 22:53:21 -04:00
dependabot[bot]
32c000f3ec
chore(deps): bump softprops/action-gh-release from 1 to 2 ( #1408 )
...
Bumps [softprops/action-gh-release](https://github.com/softprops/action-gh-release ) from 1 to 2.
- [Release notes](https://github.com/softprops/action-gh-release/releases )
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md )
- [Commits](https://github.com/softprops/action-gh-release/compare/v1...v2 )
---
updated-dependencies:
- dependency-name: softprops/action-gh-release
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-29 22:52:58 -04:00
Olivier DEBAUCHE
03c654a3d9
ci(fix): Workflow actions updates and fix arm64 wheels not included in release ( #1392 )
...
* Update test.yaml
Bump actions/checkout@v3 to v4
Bump actions/setup-python@v4 to v5
* Update test-pypi.yaml
Bump actions/setup-python@v4 to v5
* Update build-and-release.yaml
Bump softprops/action-gh-release@v1 to v2
Bump actions/checkout@v3 to v4
Bump actions/setup-python@v3 to v5
* Update publish.yaml
Bump actions/checkout@v3 to v4
Bump actions/setup-python@v4 to v5
* Update publish-to-test.yaml
Bump actions/checkout@v3 to v4
Bump actions/setup-python@v4 to v5
* Update test-pypi.yaml
Add Python 3.12
* Update build-and-release.yaml
* Update build-docker.yaml
Bump docker/setup-qemu-action@v2 to v3
Bump docker/setup-buildx-action@v2 to v3
* Update build-and-release.yaml
* Update build-and-release.yaml
2024-04-29 22:52:23 -04:00
Andrei Betlen
0c3bc4b928
fix(ci): Update generate wheel index script to include cu12.3 and cu12.4 Closes #1406
2024-04-29 12:37:22 -04:00
Olivier DEBAUCHE
2355ce2227
ci: Add support for pre-built cuda 12.4.1 wheels ( #1388 )
...
* Add support for cuda 12.4.1
* Update build-wheels-cuda.yaml
* Update build-wheels-cuda.yaml
* Update build-wheels-cuda.yaml
* Update build-wheels-cuda.yaml
* Update build-wheels-cuda.yaml
* Update build-wheels-cuda.yaml
* Update build-wheels-cuda.yaml
* Update build-wheels-cuda.yaml
* Update build-wheels-cuda.yaml
Revert
2024-04-27 23:44:47 -04:00
Andrei Betlen
a411612b38
feat: Add support for str type kv_overrides
2024-04-27 23:42:19 -04:00
Andrei Betlen
c9b85bf098
feat: Update llama.cpp
2024-04-27 23:41:54 -04:00
dependabot[bot]
c07db99e5b
chore(deps): bump pypa/cibuildwheel from 2.16.5 to 2.17.0 ( #1401 )
...
Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel ) from 2.16.5 to 2.17.0.
- [Release notes](https://github.com/pypa/cibuildwheel/releases )
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md )
- [Commits](https://github.com/pypa/cibuildwheel/compare/v2.16.5...v2.17.0 )
---
updated-dependencies:
- dependency-name: pypa/cibuildwheel
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-27 20:51:13 -04:00
dependabot[bot]
7074c4d256
chore(deps): bump docker/build-push-action from 4 to 5 ( #1400 )
...
Bumps [docker/build-push-action](https://github.com/docker/build-push-action ) from 4 to 5.
- [Release notes](https://github.com/docker/build-push-action/releases )
- [Commits](https://github.com/docker/build-push-action/compare/v4...v5 )
---
updated-dependencies:
- dependency-name: docker/build-push-action
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-27 20:51:02 -04:00
dependabot[bot]
79318ba1d1
chore(deps): bump docker/login-action from 2 to 3 ( #1399 )
...
Bumps [docker/login-action](https://github.com/docker/login-action ) from 2 to 3.
- [Release notes](https://github.com/docker/login-action/releases )
- [Commits](https://github.com/docker/login-action/compare/v2...v3 )
---
updated-dependencies:
- dependency-name: docker/login-action
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-27 20:50:50 -04:00
dependabot[bot]
27038db3d6
chore(deps): bump actions/cache from 3.3.2 to 4.0.2 ( #1398 )
...
Bumps [actions/cache](https://github.com/actions/cache ) from 3.3.2 to 4.0.2.
- [Release notes](https://github.com/actions/cache/releases )
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md )
- [Commits](https://github.com/actions/cache/compare/v3.3.2...v4.0.2 )
---
updated-dependencies:
- dependency-name: actions/cache
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-27 20:50:39 -04:00
dependabot[bot]
17bdfc818f
chore(deps): bump conda-incubator/setup-miniconda from 2.2.0 to 3.0.4 ( #1397 )
...
Bumps [conda-incubator/setup-miniconda](https://github.com/conda-incubator/setup-miniconda ) from 2.2.0 to 3.0.4.
- [Release notes](https://github.com/conda-incubator/setup-miniconda/releases )
- [Changelog](https://github.com/conda-incubator/setup-miniconda/blob/main/CHANGELOG.md )
- [Commits](https://github.com/conda-incubator/setup-miniconda/compare/v2.2.0...v3.0.4 )
---
updated-dependencies:
- dependency-name: conda-incubator/setup-miniconda
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-27 20:50:28 -04:00
Jeffrey Fong
f178636e1b
fix: Functionary bug fixes ( #1385 )
...
* fix completion tokens tracking, prompt forming
* fix 'function_call' and 'tool_calls' depending on 'functions' and 'tools', incompatibility with python 3.8
* Updated README
* fix for openai server compatibility
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-04-27 20:49:52 -04:00
iyubondyrev
e6bbfb863c
examples: fix quantize example ( #1387 )
...
@iyubondyrev thank you!
2024-04-27 20:48:47 -04:00
Olivier DEBAUCHE
c58b56123d
ci: Update action versions in build-wheels-metal.yaml ( #1390 )
...
* Bump actions/setup-python@v4 to v5
* Update build-wheels-metal.yaml
* Update build-wheels-metal.yaml
* Update build-wheels-metal.yaml
2024-04-27 20:47:49 -04:00
Olivier DEBAUCHE
9e7f738220
ci: Update dependabot.yml ( #1391 )
...
Add github-actions update
2024-04-27 20:47:07 -04:00
Andrei Betlen
65edc90671
chore: Bump version
2024-04-26 10:11:31 -04:00
Andrei Betlen
173ebc7878
fix: Remove duplicate pooling_type definition and add missing n_vocab definition in bindings
2024-04-25 21:36:09 -04:00
Douglas Hanley
f6ed21f9a2
feat: Allow for possibly non-pooled embeddings ( #1380 )
...
* allow for possibly non-pooled embeddings
* add more to embeddings section in README.md
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-04-25 21:32:44 -04:00
Andrei Betlen
fcfea66857
fix: pydantic deprecation warning
2024-04-25 21:21:48 -04:00
Andrei Betlen
7f52335c50
feat: Update llama.cpp
2024-04-25 21:21:29 -04:00
Andrei Betlen
266abfc1a3
fix(ci): Fix metal tests as well
2024-04-25 03:09:46 -04:00
Andrei Betlen
de37420fcf
fix(ci): Fix python macos test runners issue
2024-04-25 03:08:32 -04:00
Andrei Betlen
2a9979fce1
feat: Update llama.cpp
2024-04-25 02:48:26 -04:00
Andrei Betlen
c50d3300d2
chore: Bump version
2024-04-23 02:53:20 -04:00