Commit graph

  • e9b337b312 Merge https://github.com/abetlen/llama-cpp-python master baalajimaestro 2024-06-25 06:55:33 +05:30
  • 04959f1884 feat: Update llama_cpp.py bindings Andrei Betlen 2024-06-21 16:56:15 -04:00
  • 35c980eb2e chore(deps): bump pypa/cibuildwheel from 2.18.1 to 2.19.1 (#1527) dependabot[bot] 2024-06-21 12:10:43 -04:00
  • 398fe81547 chore(deps): bump docker/build-push-action from 5 to 6 (#1539) dependabot[bot] 2024-06-21 12:10:34 -04:00
  • 27d53589ff docs: Update readme examples to use newer Qwen2 model (#1544) Jon Craton 2024-06-21 12:10:15 -04:00
  • 5beec1a1fd feat: Update llama.cpp Andrei Betlen 2024-06-21 12:09:14 -04:00
  • d98a24a25b docs: Remove references to deprecated opencl backend. Closes #1512 Andrei Betlen 2024-06-20 10:50:40 -04:00
  • 6c331909ca chore: Bump version Andrei Betlen 2024-06-19 10:10:01 -04:00
  • 554fd08e7d feat: Update llama.cpp Andrei Betlen 2024-06-19 10:07:28 -04:00
  • 4c1d74c0ae fix: Make the destructor automatically call the .close() method on the Llama class. Andrei Betlen 2024-06-19 10:07:20 -04:00
  • f4491c4903 feat: Update llama.cpp Andrei Betlen 2024-06-17 11:56:03 -04:00
  • 5f5ea0a49c Merge https://github.com/abetlen/llama-cpp-python baalajimaestro 2024-06-15 10:16:33 +05:30
  • 8401c6f2d1 feat: Update llama.cpp Andrei Betlen 2024-06-13 11:31:31 -04:00
  • 9e396b3ebd feat: Update workflows and pre-built wheels (#1416) Olivier DEBAUCHE 2024-06-13 16:19:57 +02:00
  • 5af81634cb chore(deps): bump pypa/cibuildwheel from 2.18.1 to 2.19.0 (#1522) dependabot[bot] 2024-06-13 10:12:02 -04:00
  • 320a5d7ea5 feat: Add .close() method to Llama class to explicitly free model from memory (#1513) Junpei Kawamoto 2024-06-13 02:16:14 -06:00 (usage sketch at the end of this section)
  • dbcf64cf07 feat: Support SPM infill (#1492) Sigbjørn Skjæret 2024-06-13 09:45:24 +02:00
  • e342161371 feat: Update llama.cpp Andrei Betlen 2024-06-13 03:38:11 -04:00
  • 64058abaa0 Merge https://github.com/abetlen/llama-cpp-python baalajimaestro 2024-06-12 06:54:43 +05:30
  • 86a38ad4a0 chore: Bump version Andrei Betlen 2024-06-10 11:14:33 -04:00
  • 1615eb9e5b feat: Update llama.cpp Andrei Betlen 2024-06-10 11:05:45 -04:00
  • c2e4d5820a Merge https://github.com/abetlen/llama-cpp-python baalajimaestro 2024-06-09 10:46:41 +05:30
  • 83d6b26e6f feat: Update llama.cpp Andrei Betlen 2024-06-08 23:14:22 -04:00
  • 255e1b4495 feat: Update llama.cpp Andrei Betlen 2024-06-07 02:02:12 -04:00
  • d634efcdd9 feat: adding rpc_servers parameter to Llama class (#1477) nullname 2024-06-04 22:38:21 +08:00 (usage sketch at the end of this section)
  • 6e0642ca19 fix: fix logprobs when BOS is not present (#1471) Asghar Ghorbani 2024-06-04 16:18:38 +02:00
  • 027f7bc678 fix: Avoid duplicate special tokens in chat formats (#1439) Sigbjørn Skjæret 2024-06-04 16:15:41 +02:00
  • 951e39caf9 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-06-04 00:49:26 -04:00
  • c3ef41ba06 chore: Bump version Andrei Betlen 2024-06-04 00:49:24 -04:00
  • ae5682f500 fix: Disable Windows+CUDA workaround when compiling for HIPBLAS (#1493) Engininja2 2024-06-03 22:42:34 -06:00
  • cd3f1bb387 feat: Update llama.cpp Andrei Betlen 2024-06-04 00:35:47 -04:00
  • 6b018e00b1 misc: Improve llava error messages Andrei Betlen 2024-06-03 11:19:10 -04:00
  • a6457ba74b Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-06-01 18:10:13 -04:00
  • af3ed503e9 fix: Use numpy recarray for candidates data, fixes bug with temp < 0 Andrei Betlen 2024-06-01 18:09:24 -04:00
  • fc2af04c15 Merge https://github.com/abetlen/llama-cpp-python baalajimaestro 2024-05-30 08:20:49 +05:30
  • 165b4dc6c1 fix: Fix typo in Llama3VisionAlphaChatHandler. Closes #1488 Andrei Betlen 2024-05-29 02:29:44 -04:00
  • 91d05aba46 fix: adjust kv_override member names to match llama.cpp Andrei Betlen 2024-05-29 02:28:58 -04:00
  • df45a4b3fe fix: fix string value kv_overrides. Closes #1487 Andrei Betlen 2024-05-29 02:02:22 -04:00
  • 10b7c50cd2 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-05-28 22:52:30 -04:00
  • 2907c26906 misc: Update debug build to keep all debug symbols for easier gdb debugging Andrei Betlen 2024-05-28 22:52:28 -04:00
  • c26004b1be feat: Update llama.cpp Andrei Betlen 2024-05-28 22:52:03 -04:00
  • c564007ff6 chore(deps): bump pypa/cibuildwheel from 2.18.0 to 2.18.1 (#1472) dependabot[bot] 2024-05-27 10:57:17 -04:00
  • 454c9bb1cb feat: Update llama.cpp Andrei Betlen 2024-05-27 10:51:57 -04:00
  • 2d89964147 docs: Fix table formatting Andrei Betlen 2024-05-24 11:55:41 -04:00
  • 9e8d7d55bd fix(docs): Fix link typo Andrei Betlen 2024-05-24 11:55:01 -04:00
  • ec43e8920f docs: Update multi-modal model section Andrei Betlen 2024-05-24 11:54:15 -04:00
  • a4c9ab885d chore: Bump version Andrei Betlen 2024-05-24 01:59:25 -04:00
  • 5cae1040e3 feat: Improve Llama.eval performance by avoiding list conversion (#1476) Linghan Zhong 2024-05-24 00:49:44 -05:00
  • 087cc0b036 feat: Update llama.cpp Andrei Betlen 2024-05-24 01:43:36 -04:00
  • 5a595f035a feat: Update llama.cpp Andrei Betlen 2024-05-22 02:40:31 -04:00
  • 5b4ad6f4d1 Merge https://github.com/abetlen/llama-cpp-python baalajimaestro 2024-05-20 15:31:44 +05:30
  • 3dbfec74e7 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-05-18 01:19:20 -04:00
  • d8a3b013c3 feat: Update llama.cpp Andrei Betlen 2024-05-18 01:19:19 -04:00
  • 03f171e810 example: LLM inference with Ray Serve (#1465) Radoslav Gerganov 2024-05-17 20:27:26 +03:00
  • b564d05806 chore: Bump version Andrei Betlen 2024-05-16 00:41:21 -04:00
  • d99a6ba607 fix: segfault for models without eos / bos tokens. Closes #1463 Andrei Betlen 2024-05-16 00:37:27 -04:00
  • e811a81066 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-05-15 23:59:18 -04:00
  • ca8e3c967d feat: Update llama.cpp Andrei Betlen 2024-05-15 23:59:17 -04:00
  • 5212fb08ae feat: add MinTokensLogitProcessor and min_tokens argument to server (#1333) twaka 2024-05-14 22:50:53 +09:00 (usage sketch at the end of this section)
  • 389e09c2f5 misc: Remove unnecessary metadata lookups (#1448) Sigbjørn Skjæret 2024-05-14 15:44:09 +02:00
  • 4b54f79330 chore(deps): bump pypa/cibuildwheel from 2.17.0 to 2.18.0 (#1453) dependabot[bot] 2024-05-14 09:35:52 -04:00
  • 50f5c74ecf Update llama.cpp Andrei Betlen 2024-05-14 09:30:04 -04:00
  • 43ba1526c8 feat: Update llama.cpp Andrei Betlen 2024-05-13 09:39:08 -04:00
  • 3f8e17af63 fix(ci): Use version without extra platform tag in pep503 index Andrei Betlen 2024-05-12 11:45:55 -04:00
  • 3c19faa0d4 chore: Bump version Andrei Betlen 2024-05-12 10:32:52 -04:00
  • 3fe8e9a8f3 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-05-12 10:30:24 -04:00
  • 9dc5e20fb6 feat: Update llama.cpp Andrei Betlen 2024-05-12 10:30:23 -04:00
  • 1547202b77 docs: Fix typo in README.md (#1444) Peng Yu 2024-05-10 10:35:51 -04:00
  • 7f59856fa6 fix: Enable CUDA backend for llava. Closes #1324 Andrei Betlen 2024-05-10 10:18:47 -04:00
  • 73165021bb chore: Bump version Andrei Betlen 2024-05-10 09:44:18 -04:00
  • eafb6ec5e8 feat: Update llama.cpp Andrei Betlen 2024-05-10 08:39:55 -04:00
  • ac55d0a175 fix: Clear kv cache to avoid kv bug when image is evaluated first Andrei Betlen 2024-05-10 02:38:10 -04:00
  • 4badac3a60 chore: Bump version Andrei Betlen 2024-05-10 00:56:19 -04:00
  • 561e880654 fix(security): Render all jinja templates in immutable sandbox (#1441) Sigbjørn Skjæret 2024-05-10 06:49:40 +02:00
  • b454f40a9a Merge pull request from GHSA-56xg-wfcc-g829 Patrick Peng 2024-05-10 12:47:56 +08:00
  • 5ab40e6167 feat: Support multiple chat templates - step 1 (#1396) Sigbjørn Skjæret 2024-05-09 15:49:09 +02:00
  • bf66a283e8 chore: Bump version Andrei Betlen 2024-05-09 03:02:52 -04:00
  • 3757328b70 fix: free last image embed in llava chat handler Andrei Betlen 2024-05-08 22:16:18 -04:00
  • 77122638b4 fix: Make leading bos_token optional for image chat formats, fix nanollava system message Andrei Betlen 2024-05-08 13:12:31 -04:00
  • 2a39b99575 feat: Update llama.cpp Andrei Betlen 2024-05-08 08:42:22 -04:00
  • 9ce5cb376a chore: Bump version Andrei Betlen 2024-05-08 02:36:42 -04:00
  • 4a7122d22f feat: fill-in-middle support (#1386) Sigbjørn Skjæret 2024-05-08 08:26:22 +02:00
  • 228949c1f7 feat: Update llama.cpp Andrei Betlen 2024-05-08 02:22:15 -04:00
  • 903b28adf5 fix: adding missing args in create_completion for functionary chat handler (#1430) Sarunas Kalade 2024-05-08 07:21:27 +01:00
  • 07966b9ba7 docs: update README.md (#1432) Ikko Eltociear Ashimine 2024-05-08 15:20:20 +09:00
  • a50d24e3a7 fix: chat_format log where auto-detected format prints None (#1434) Bruno Alvisio 2024-05-07 23:19:35 -07:00
  • 0318702cdc feat(server): Add support for setting root_path. Closes #1420 Andrei Betlen 2024-05-05 12:49:31 -04:00
  • 3666833107 feat(ci): Add docker checks and check deps more frequently (#1426) Olivier DEBAUCHE 2024-05-05 18:42:28 +02:00
  • 3e2597eac8 feat: Update llama.cpp Andrei Betlen 2024-05-05 12:12:27 -04:00
  • e0d7674e62 fix: detokenization case where first token does not start with a leading space (#1375) Noam Gat 2024-05-04 17:14:59 +03:00
  • 1f56c648c3 feat: Implement streaming for Functionary v2 + Bug fixes (#1419) Jeffrey Fong 2024-05-04 22:11:20 +08:00
  • f9b7221c8f Merge branch 'main' of github.com:abetlen/llama_cpp_python into main Andrei Betlen 2024-05-03 19:07:54 -04:00
  • 9f7a85571a fix: Use memmove to copy str_value kv_override. Closes #1417 Andrei Betlen 2024-05-03 19:07:50 -04:00
  • 0a454bebe6 feat(server): Remove temperature bounds checks for server. Closes #1384 Andrei Betlen 2024-05-03 15:23:06 -04:00
  • 2138561fab fix(server): Propagate flash_attn to model load. (#1424) Daniel Thuerck 2024-05-03 18:17:07 +02:00
  • 2117122396 chore: Bump version Andrei Betlen 2024-05-02 12:07:09 -04:00
  • d75dea18db feat: Update llama.cpp Andrei Betlen 2024-05-02 12:00:44 -04:00
  • 31b1d95a6c feat: Add llama-3-vision-alpha chat format Andrei Betlen 2024-05-02 11:32:18 -04:00
  • 1d177aaaef Merge https://github.com/abetlen/llama-cpp-python baalajimaestro 2024-05-02 18:13:32 +05:30
  • 4f01c452b6 fix: Change default value of verbose in image chat format handlers to True to match Llama Andrei Betlen 2024-04-30 15:50:30 -04:00
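
Notes on selected commits above. Commit 320a5d7ea5 (#1513) adds an explicit .close() method to the Llama class so model memory can be freed deterministically, and 4c1d74c0ae makes the destructor call it automatically. A minimal usage sketch, assuming a local GGUF model file (the path below is a placeholder):

```python
from llama_cpp import Llama

# Placeholder model path; any local GGUF model works here.
llm = Llama(model_path="./models/qwen2-0_5b-instruct-q8_0.gguf", n_ctx=2048)
try:
    out = llm("Q: Name the planets in the solar system. A:", max_tokens=32)
    print(out["choices"][0]["text"])
finally:
    llm.close()  # explicitly frees the model and context (per #1513)
```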
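
Commit 5212fb08ae (#1333) adds a MinTokensLogitProcessor and a min_tokens argument to the OpenAI-compatible server, which suppresses end-of-sequence until a minimum number of tokens has been generated. A hedged sketch of exercising it over HTTP, assuming the server was started with `python -m llama_cpp.server --model <path>` on its default port 8000:

```python
import requests

# min_tokens is the llama-cpp-python extension added by #1333;
# it is not part of the standard OpenAI completion schema.
resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "prompt": "Write a short sentence about the sea.",
        "max_tokens": 64,
        "min_tokens": 16,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["text"])
```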
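
Commit d634efcdd9 (#1477) adds an rpc_servers parameter to the Llama class for llama.cpp's RPC backend, which offloads computation to remote hosts. A sketch of the call shape, assuming the underlying llama.cpp was built with RPC support and that the parameter takes a comma-separated list of host:port addresses (both assumptions, so check the installed version before relying on them):

```python
from llama_cpp import Llama

# Assumed form: comma-separated "host:port" RPC endpoints (see #1477).
# Requires a llama.cpp build with the RPC backend enabled.
llm = Llama(
    model_path="./models/qwen2-0_5b-instruct-q8_0.gguf",  # placeholder path
    rpc_servers="192.168.1.10:50052,192.168.1.11:50052",  # hypothetical hosts
    n_gpu_layers=-1,  # offload layers to the remote backends (assumed behavior)
)
```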