dependabot[bot]
17bdfc818f
chore(deps): bump conda-incubator/setup-miniconda from 2.2.0 to 3.0.4 (#1397)
...
Bumps [conda-incubator/setup-miniconda](https://github.com/conda-incubator/setup-miniconda) from 2.2.0 to 3.0.4.
- [Release notes](https://github.com/conda-incubator/setup-miniconda/releases)
- [Changelog](https://github.com/conda-incubator/setup-miniconda/blob/main/CHANGELOG.md)
- [Commits](https://github.com/conda-incubator/setup-miniconda/compare/v2.2.0...v3.0.4)
---
updated-dependencies:
- dependency-name: conda-incubator/setup-miniconda
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-27 20:50:28 -04:00
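The bump above is a GitHub Actions dependency update; a minimal sketch of what the affected workflow step might look like after the bump (job layout and inputs are illustrative, not taken from this repo's workflow files):

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # version bumped from v2.2.0 by the commit above
      - uses: conda-incubator/setup-miniconda@v3.0.4
        with:
          python-version: "3.11"
          auto-update-conda: true
```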
Jeffrey Fong
f178636e1b
fix: Functionary bug fixes (#1385)
...
* fix completion tokens tracking and prompt forming
* fix 'function_call' and 'tool_calls' handling when 'functions' and 'tools' are provided; fix incompatibility with Python 3.8
* Updated README
* fix for openai server compatibility
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-04-27 20:49:52 -04:00
iyubondyrev
e6bbfb863c
examples: fix quantize example (#1387)
...
@iyubondyrev thank you!
2024-04-27 20:48:47 -04:00
Olivier DEBAUCHE
c58b56123d
ci: Update action versions in build-wheels-metal.yaml (#1390)
...
* Bump actions/setup-python@v4 to v5
* Update build-wheels-metal.yaml
* Update build-wheels-metal.yaml
* Update build-wheels-metal.yaml
2024-04-27 20:47:49 -04:00
Olivier DEBAUCHE
9e7f738220
ci: Update dependabot.yml (#1391)
...
Add github-actions update
2024-04-27 20:47:07 -04:00
Andrei Betlen
65edc90671
chore: Bump version
2024-04-26 10:11:31 -04:00
Andrei Betlen
173ebc7878
fix: Remove duplicate pooling_type definition and add missing n_vocab definition in bindings
2024-04-25 21:36:09 -04:00
Douglas Hanley
f6ed21f9a2
feat: Allow for possibly non-pooled embeddings (#1380)
...
* allow for possibly non-pooled embeddings
* add more to embeddings section in README.md
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-04-25 21:32:44 -04:00
Andrei Betlen
fcfea66857
fix: pydantic deprecation warning
2024-04-25 21:21:48 -04:00
Andrei Betlen
7f52335c50
feat: Update llama.cpp
2024-04-25 21:21:29 -04:00
Andrei Betlen
266abfc1a3
fix(ci): Fix metal tests as well
2024-04-25 03:09:46 -04:00
Andrei Betlen
de37420fcf
fix(ci): Fix python macos test runners issue
2024-04-25 03:08:32 -04:00
Andrei Betlen
2a9979fce1
feat: Update llama.cpp
2024-04-25 02:48:26 -04:00
Andrei Betlen
c50d3300d2
chore: Bump version
2024-04-23 02:53:20 -04:00
Andrei Betlen
611781f531
ci: Build arm64 wheels. Closes #1342
2024-04-23 02:48:09 -04:00
Sean Bailey
53ebcc8bb5
feat(server): Provide ability to dynamically allocate all threads if desired using -1 (#1364)
2024-04-23 02:35:38 -04:00
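The "-1 means all threads" convention above can be sketched as a tiny resolver (the function name is hypothetical; this is not the server's actual code):

```python
import os

def resolve_threads(n_threads: int) -> int:
    """Treat -1 as 'use every available CPU thread', as in #1364.

    Any positive value is passed through unchanged; -1 expands to the
    host's CPU count (falling back to 1 if it cannot be detected).
    """
    if n_threads == -1:
        return os.cpu_count() or 1
    return n_threads
```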
Geza Velkey
507c1da066
fix: Update scikit-build-core build dependency avoid bug in 0.9.1 (#1370)
...
cmake [options] <path-to-source>
cmake [options] <path-to-existing-build>
cmake [options] -S <path-to-source> -B <path-to-build>
Specify a source directory to (re-)generate a build system for it in the
current working directory. Specify an existing build directory to
re-generate its build system.
Run 'cmake --help' for more information.
2024-04-23 02:34:15 -04:00
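A pin like the one above is typically expressed in `pyproject.toml`; a hypothetical excerpt excluding the broken release (the exact version bounds and extras here are assumptions, not the commit's actual diff):

```toml
[build-system]
# exclude 0.9.1, whose cmake invocation was broken
requires = ["scikit-build-core[pyproject]>=0.5.1,!=0.9.1"]
build-backend = "scikit_build_core.build"
```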
abk16
8559e8ce88
feat: Add Llama-3 chat format (#1371)
...
* feat: Add Llama-3 chat format
* feat: Auto-detect Llama-3 chat format from gguf template
* feat: Update llama.cpp to b2715
Includes proper Llama-3 <|eot_id|> token handling.
---------
Co-authored-by: Andrei Betlen <abetlen@gmail.com>
2024-04-23 02:33:29 -04:00
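The Llama-3 instruct template referenced above (including the `<|eot_id|>` turn terminator) can be sketched as follows; the function name is hypothetical, and the real library auto-detects the format from the GGUF chat template rather than hard-coding it like this:

```python
def format_llama3(messages: list[dict]) -> str:
    """Render chat messages with the Llama-3 special tokens.

    Each turn is wrapped in role headers and ended with <|eot_id|>;
    the prompt finishes with an open assistant header so the model
    generates the reply.
    """
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt
```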
Andrei Betlen
617d536e1c
feat: Update llama.cpp
2024-04-23 02:31:40 -04:00
Andrei Betlen
d40a250ef3
feat: Use new llama_token_is_eog in create_completions
2024-04-22 00:35:47 -04:00
Andrei Betlen
b21ba0e2ac
Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main
2024-04-21 20:46:42 -04:00
Andrei Betlen
159cc4e5d9
feat: Update llama.cpp
2024-04-21 20:46:40 -04:00
Andrei Betlen
0281214863
chore: Bump version
2024-04-20 00:09:37 -04:00
Andrei Betlen
cc81afebf0
feat: Add stopping_criteria to ChatFormatter, allow stopping on arbitrary token ids, fixes llama3 instruct
2024-04-20 00:00:53 -04:00
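Stopping on arbitrary token ids, as the commit above enables, amounts to a criteria callable that fires when the last sampled token is in a stop set. A minimal sketch (names hypothetical; the library's actual stopping-criteria signature may differ):

```python
def make_stop_on_tokens(stop_ids):
    """Build a stopping criterion that halts generation once the most
    recently sampled token id is one of `stop_ids` (e.g. Llama-3's
    <|eot_id|>)."""
    stop = set(stop_ids)

    def criteria(input_ids, logits) -> bool:
        # input_ids: sequence of token ids generated so far
        return input_ids[-1] in stop

    return criteria
```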
Andrei Betlen
d17c1887a3
feat: Update llama.cpp
2024-04-19 23:58:16 -04:00
Andrei Betlen
893a27a736
chore: Bump version
2024-04-18 01:43:39 -04:00
Andrei Betlen
a128c80500
feat: Update llama.cpp
2024-04-18 01:39:45 -04:00
Lucca Zenóbio
4f42664955
feat: update grammar schema converter to match llama.cpp (#1353)
...
* feat: improve function calling
* feat: grammar
* fix
* fix
* fix
2024-04-18 01:36:25 -04:00
Andrei Betlen
fa4bb0cf81
Revert "feat: Update json to grammar (#1350)"
...
This reverts commit 610a592f70.
2024-04-17 16:18:16 -04:00
Lucca Zenóbio
610a592f70
feat: Update json to grammar (#1350)
...
* feat: improve function calling
* feat: grammar
2024-04-17 10:10:21 -04:00
khimaros
b73c73c0c6
feat: add disable_ping_events flag (#1257)
...
For backward compatibility, this is false by default.
It can be set to true to disable EventSource pings,
which are not supported by some OpenAI clients.
Fixes https://github.com/abetlen/llama-cpp-python/issues/1256
2024-04-17 10:08:19 -04:00
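In Server-Sent Events, ping keep-alives are comment lines starting with `:`, which spec-compliant clients ignore but some OpenAI client libraries reject. A minimal sketch of how such a flag can gate them (generator and names hypothetical, not the server's actual code):

```python
def sse_events(chunks, disable_ping_events: bool = False):
    """Yield an SSE stream for `chunks`, optionally suppressing the
    leading ping comment for clients that cannot parse it."""
    if not disable_ping_events:
        yield ": ping\n\n"  # SSE comment line; ignored by compliant clients
    for chunk in chunks:
        yield f"data: {chunk}\n\n"
```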
tc-wolf
4924455dec
feat: Make saved state more compact on-disk (#1296)
...
* State load/save changes
- Only store up to `n_tokens` logits instead of full `(n_ctx, n_vocab)`
sized array.
- Difference between ~350MB and ~1500MB for example prompt with ~300
tokens (makes sense lol)
- Auto-formatting changes
* Back out formatting changes
2024-04-17 10:06:50 -04:00
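Back-of-envelope arithmetic for the logits-array part of that saving (the commit's ~350MB/~1500MB figures cover the whole saved state; the context, vocabulary, and prompt sizes below are assumed values for illustration):

```python
# Assumed sizes: 4096-token context, 32000-entry vocab, ~300-token prompt
n_ctx, n_vocab, n_tokens = 4096, 32000, 300
bytes_per_float = 4

full = n_ctx * n_vocab * bytes_per_float      # old: full (n_ctx, n_vocab) array
compact = n_tokens * n_vocab * bytes_per_float  # new: only n_tokens rows

print(full // 2**20, "MiB vs", compact // 2**20, "MiB")  # 500 MiB vs 36 MiB
```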
Andrei Betlen
9842cbf99d
feat: Update llama.cpp
2024-04-17 10:06:15 -04:00
ddh0
c96b2daebf
feat: Use all available CPUs for batch processing (#1345)
2024-04-17 10:05:54 -04:00
Andrei Betlen
a420f9608b
feat: Update llama.cpp
2024-04-14 19:14:09 -04:00
Andrei Betlen
90dceaba8a
feat: Update llama.cpp
2024-04-14 11:35:57 -04:00
Andrei Betlen
2e9ffd28fd
feat: Update llama.cpp
2024-04-12 21:09:12 -04:00
Andrei Betlen
ef29235d45
chore: Bump version
2024-04-10 03:44:46 -04:00
Andrei Betlen
bb65b4d764
fix: pass correct type to chat handlers for chat completion logprobs
2024-04-10 03:41:55 -04:00
Andrei Betlen
060bfa64d5
feat: Add support for yaml based configs
2024-04-10 02:47:01 -04:00
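A hedged example of what a YAML server config for the commit above might contain (`model`, `model_alias`, `chat_format`, and `n_gpu_layers` are real server model settings, but the exact file layout and values here are assumptions):

```yaml
host: 0.0.0.0
port: 8000
models:
  - model: ./models/llama-3-8b-instruct.Q4_K_M.gguf
    model_alias: llama-3
    chat_format: llama-3
    n_gpu_layers: -1
```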
Andrei Betlen
1347e1d050
feat: Add typechecking for ctypes structure attributes
2024-04-10 02:40:41 -04:00
Andrei Betlen
889d0e8981
feat: Update llama.cpp
2024-04-10 02:25:58 -04:00
Andrei Betlen
56071c956a
feat: Update llama.cpp
2024-04-09 09:53:49 -04:00
Andrei Betlen
08b16afe11
chore: Bump version
2024-04-06 01:53:38 -04:00
Andrei Betlen
7ca364c8bd
feat: Update llama.cpp
2024-04-06 01:37:43 -04:00
Andrei Betlen
b3bfea6dbf
fix: Always embed metal library. Closes #1332
2024-04-06 01:36:53 -04:00
Andrei Betlen
f4092e6b46
feat: Update llama.cpp
2024-04-05 10:59:31 -04:00
Andrei Betlen
2760ef6156
Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main
2024-04-05 10:51:54 -04:00
Andrei Betlen
1ae3abbcc3
fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes #1328 Closes #1314
2024-04-05 10:51:44 -04:00
Andrei Betlen
49bc66bfa2
fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes #1328 #1314
2024-04-05 10:50:49 -04:00