Commit graph

  • e9b337b312 Merge https://github.com/abetlen/llama-cpp-python master baalajimaestro 2024-06-25 06:55:33 +05:30
  • 04959f1884 feat: Update llama_cpp.py bindings Andrei Betlen 2024-06-21 16:56:15 -04:00
  • 35c980eb2e chore(deps): bump pypa/cibuildwheel from 2.18.1 to 2.19.1 (#1527) dependabot[bot] 2024-06-21 12:10:43 -04:00
  • 398fe81547 chore(deps): bump docker/build-push-action from 5 to 6 (#1539) dependabot[bot] 2024-06-21 12:10:34 -04:00
  • 27d53589ff docs: Update readme examples to use newer Qwen2 model (#1544) Jon Craton 2024-06-21 12:10:15 -04:00
  • 5beec1a1fd feat: Update llama.cpp Andrei Betlen 2024-06-21 12:09:14 -04:00
  • d98a24a25b docs: Remove references to deprecated opencl backend. Closes #1512 Andrei Betlen 2024-06-20 10:50:40 -04:00
  • 6c331909ca chore: Bump version Andrei Betlen 2024-06-19 10:10:01 -04:00
  • 554fd08e7d feat: Update llama.cpp Andrei Betlen 2024-06-19 10:07:28 -04:00
  • 4c1d74c0ae fix: Make the destructor automatically call the .close() method on the Llama class. Andrei Betlen 2024-06-19 10:07:20 -04:00
  • f4491c4903 feat: Update llama.cpp Andrei Betlen 2024-06-17 11:56:03 -04:00
  • 5f5ea0a49c Merge https://github.com/abetlen/llama-cpp-python baalajimaestro 2024-06-15 10:16:33 +05:30
  • 8401c6f2d1 feat: Update llama.cpp Andrei Betlen 2024-06-13 11:31:31 -04:00
  • 9e396b3ebd feat: Update workflows and pre-built wheels (#1416) Olivier DEBAUCHE 2024-06-13 16:19:57 +02:00
  • 5af81634cb chore(deps): bump pypa/cibuildwheel from 2.18.1 to 2.19.0 (#1522) dependabot[bot] 2024-06-13 10:12:02 -04:00
  • 320a5d7ea5 feat: Add .close() method to Llama class to explicitly free model from memory (#1513) Junpei Kawamoto 2024-06-13 02:16:14 -06:00 (usage sketch at the end of this section)
  • dbcf64cf07 feat: Support SPM infill (#1492) Sigbjørn Skjæret 2024-06-13 09:45:24 +02:00
  • e342161371 feat: Update llama.cpp Andrei Betlen 2024-06-13 03:38:11 -04:00
  • 64058abaa0 Merge https://github.com/abetlen/llama-cpp-python baalajimaestro 2024-06-12 06:54:43 +05:30
  • 86a38ad4a0 chore: Bump version Andrei Betlen 2024-06-10 11:14:33 -04:00
  • 1615eb9e5b feat: Update llama.cpp Andrei Betlen 2024-06-10 11:05:45 -04:00
  • c2e4d5820a Merge https://github.com/abetlen/llama-cpp-python baalajimaestro 2024-06-09 10:46:41 +05:30
  • 83d6b26e6f feat: Update llama.cpp Andrei Betlen 2024-06-08 23:14:22 -04:00
  • 255e1b4495 feat: Update llama.cpp Andrei Betlen 2024-06-07 02:02:12 -04:00
  • d634efcdd9 feat: adding rpc_servers parameter to Llama class (#1477) nullname 2024-06-04 22:38:21 +08:00 (usage sketch at the end of this section)
  • 6e0642ca19 fix: fix logprobs when BOS is not present (#1471) Asghar Ghorbani 2024-06-04 16:18:38 +02:00
  • 027f7bc678 fix: Avoid duplicate special tokens in chat formats (#1439) Sigbjørn Skjæret 2024-06-04 16:15:41 +02:00
  • 951e39caf9 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-06-04 00:49:26 -04:00
  • c3ef41ba06 chore: Bump version Andrei Betlen 2024-06-04 00:49:24 -04:00
  • ae5682f500 fix: Disable Windows+CUDA workaround when compiling for HIPBLAS (#1493) Engininja2 2024-06-03 22:42:34 -06:00
  • cd3f1bb387 feat: Update llama.cpp Andrei Betlen 2024-06-04 00:35:47 -04:00
  • 6b018e00b1 misc: Improve llava error messages Andrei Betlen 2024-06-03 11:19:10 -04:00
  • a6457ba74b Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-06-01 18:10:13 -04:00
  • af3ed503e9 fix: Use numpy recarray for candidates data, fixes bug with temp < 0 Andrei Betlen 2024-06-01 18:09:24 -04:00
  • fc2af04c15 Merge https://github.com/abetlen/llama-cpp-python baalajimaestro 2024-05-30 08:20:49 +05:30
  • 165b4dc6c1 fix: Fix typo in Llama3VisionAlphaChatHandler. Closes #1488 Andrei Betlen 2024-05-29 02:29:44 -04:00
  • 91d05aba46 fix: adjust kv_override member names to match llama.cpp Andrei Betlen 2024-05-29 02:28:58 -04:00
  • df45a4b3fe fix: fix string value kv_overrides. Closes #1487 Andrei Betlen 2024-05-29 02:02:22 -04:00
  • 10b7c50cd2 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-05-28 22:52:30 -04:00
  • 2907c26906 misc: Update debug build to keep all debug symbols for easier gdb debugging Andrei Betlen 2024-05-28 22:52:28 -04:00
  • c26004b1be feat: Update llama.cpp Andrei Betlen 2024-05-28 22:52:03 -04:00
  • c564007ff6 chore(deps): bump pypa/cibuildwheel from 2.18.0 to 2.18.1 (#1472) dependabot[bot] 2024-05-27 10:57:17 -04:00
  • 454c9bb1cb feat: Update llama.cpp Andrei Betlen 2024-05-27 10:51:57 -04:00
  • 2d89964147 docs: Fix table formatting Andrei Betlen 2024-05-24 11:55:41 -04:00
  • 9e8d7d55bd fix(docs): Fix link typo Andrei Betlen 2024-05-24 11:55:01 -04:00
  • ec43e8920f docs: Update multi-modal model section Andrei Betlen 2024-05-24 11:54:15 -04:00
  • a4c9ab885d chore: Bump version Andrei Betlen 2024-05-24 01:59:25 -04:00
  • 5cae1040e3 feat: Improve Llama.eval performance by avoiding list conversion (#1476) Linghan Zhong 2024-05-24 00:49:44 -05:00
  • 087cc0b036 feat: Update llama.cpp Andrei Betlen 2024-05-24 01:43:36 -04:00
  • 5a595f035a feat: Update llama.cpp Andrei Betlen 2024-05-22 02:40:31 -04:00
  • 5b4ad6f4d1 Merge https://github.com/abetlen/llama-cpp-python baalajimaestro 2024-05-20 15:31:44 +05:30
  • 3dbfec74e7 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-05-18 01:19:20 -04:00
  • d8a3b013c3 feat: Update llama.cpp Andrei Betlen 2024-05-18 01:19:19 -04:00
  • 03f171e810 example: LLM inference with Ray Serve (#1465) Radoslav Gerganov 2024-05-17 20:27:26 +03:00
  • b564d05806 chore: Bump version Andrei Betlen 2024-05-16 00:41:21 -04:00
  • d99a6ba607 fix: segfault for models without eos / bos tokens. Closes #1463 Andrei Betlen 2024-05-16 00:37:27 -04:00
  • e811a81066 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-05-15 23:59:18 -04:00
  • ca8e3c967d feat: Update llama.cpp Andrei Betlen 2024-05-15 23:59:17 -04:00
  • 5212fb08ae feat: add MinTokensLogitProcessor and min_tokens argument to server (#1333) twaka 2024-05-14 22:50:53 +09:00 (usage sketch at the end of this section)
  • 389e09c2f5 misc: Remove unnecessary metadata lookups (#1448) Sigbjørn Skjæret 2024-05-14 15:44:09 +02:00
  • 4b54f79330 chore(deps): bump pypa/cibuildwheel from 2.17.0 to 2.18.0 (#1453) dependabot[bot] 2024-05-14 09:35:52 -04:00
  • 50f5c74ecf Update llama.cpp Andrei Betlen 2024-05-14 09:30:04 -04:00
  • 43ba1526c8 feat: Update llama.cpp Andrei Betlen 2024-05-13 09:39:08 -04:00
  • 3f8e17af63 fix(ci): Use version without extra platform tag in pep503 index Andrei Betlen 2024-05-12 11:45:55 -04:00
  • 3c19faa0d4 chore: Bump version Andrei Betlen 2024-05-12 10:32:52 -04:00
  • 3fe8e9a8f3 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-05-12 10:30:24 -04:00
  • 9dc5e20fb6 feat: Update llama.cpp Andrei Betlen 2024-05-12 10:30:23 -04:00
  • 1547202b77 docs: Fix typo in README.md (#1444) Peng Yu 2024-05-10 10:35:51 -04:00
  • 7f59856fa6 fix: Enable CUDA backend for llava. Closes #1324 Andrei Betlen 2024-05-10 10:18:47 -04:00
  • 73165021bb chore: Bump version Andrei Betlen 2024-05-10 09:44:18 -04:00
  • eafb6ec5e8 feat: Update llama.cpp Andrei Betlen 2024-05-10 08:39:55 -04:00
  • ac55d0a175 fix: Clear kv cache to avoid kv bug when image is evaluated first Andrei Betlen 2024-05-10 02:38:10 -04:00
  • 4badac3a60 chore: Bump version Andrei Betlen 2024-05-10 00:56:19 -04:00
  • 561e880654 fix(security): Render all jinja templates in immutable sandbox (#1441) Sigbjørn Skjæret 2024-05-10 06:49:40 +02:00
  • b454f40a9a Merge pull request from GHSA-56xg-wfcc-g829 Patrick Peng 2024-05-10 12:47:56 +08:00
  • 5ab40e6167 feat: Support multiple chat templates - step 1 (#1396) Sigbjørn Skjæret 2024-05-09 15:49:09 +02:00
  • bf66a283e8 chore: Bump version Andrei Betlen 2024-05-09 03:02:52 -04:00
  • 3757328b70 fix: free last image embed in llava chat handler Andrei Betlen 2024-05-08 22:16:18 -04:00
  • 77122638b4 fix: Make leading bos_token optional for image chat formats, fix nanollava system message Andrei Betlen 2024-05-08 13:12:31 -04:00
  • 2a39b99575 feat: Update llama.cpp Andrei Betlen 2024-05-08 08:42:22 -04:00
  • 9ce5cb376a chore: Bump version Andrei Betlen 2024-05-08 02:36:42 -04:00
  • 4a7122d22f feat: fill-in-middle support (#1386) Sigbjørn Skjæret 2024-05-08 08:26:22 +02:00
  • 228949c1f7 feat: Update llama.cpp Andrei Betlen 2024-05-08 02:22:15 -04:00
  • 903b28adf5 fix: adding missing args in create_completion for functionary chat handler (#1430) Sarunas Kalade 2024-05-08 07:21:27 +01:00
  • 07966b9ba7 docs: update README.md (#1432) Ikko Eltociear Ashimine 2024-05-08 15:20:20 +09:00
  • a50d24e3a7 fix: chat_format log where auto-detected format prints None (#1434) Bruno Alvisio 2024-05-07 23:19:35 -07:00
  • 0318702cdc feat(server): Add support for setting root_path. Closes #1420 Andrei Betlen 2024-05-05 12:49:31 -04:00
  • 3666833107 feat(ci): Add docker checks and check deps more frequently (#1426) Olivier DEBAUCHE 2024-05-05 18:42:28 +02:00
  • 3e2597eac8 feat: Update llama.cpp Andrei Betlen 2024-05-05 12:12:27 -04:00
  • e0d7674e62 fix: detokenization case where first token does not start with a leading space (#1375) Noam Gat 2024-05-04 17:14:59 +03:00
  • 1f56c648c3 feat: Implement streaming for Functionary v2 + Bug fixes (#1419) Jeffrey Fong 2024-05-04 22:11:20 +08:00
  • f9b7221c8f Merge branch 'main' of github.com:abetlen/llama_cpp_python into main Andrei Betlen 2024-05-03 19:07:54 -04:00
  • 9f7a85571a fix: Use memmove to copy str_value kv_override. Closes #1417 Andrei Betlen 2024-05-03 19:07:50 -04:00
  • 0a454bebe6 feat(server): Remove temperature bounds checks for server. Closes #1384 Andrei Betlen 2024-05-03 15:23:06 -04:00
  • 2138561fab fix(server): Propagate flash_attn to model load. (#1424) Daniel Thuerck 2024-05-03 18:17:07 +02:00
  • 2117122396 chore: Bump version Andrei Betlen 2024-05-02 12:07:09 -04:00
  • d75dea18db feat: Update llama.cpp Andrei Betlen 2024-05-02 12:00:44 -04:00
  • 31b1d95a6c feat: Add llama-3-vision-alpha chat format Andrei Betlen 2024-05-02 11:32:18 -04:00
  • 1d177aaaef Merge https://github.com/abetlen/llama-cpp-python baalajimaestro 2024-05-02 18:13:32 +05:30
  • 4f01c452b6 fix: Change default value of verbose in image chat format handlers to True to match Llama Andrei Betlen 2024-04-30 15:50:30 -04:00
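
Notes on selected commits above. Commit 320a5d7ea5 (#1513) adds an explicit .close() method to the Llama class so model memory can be freed deterministically, and 4c1d74c0ae makes the destructor call it automatically. A minimal usage sketch, assuming a local GGUF model file (the path below is a placeholder):

```python
from llama_cpp import Llama

# Placeholder model path; any local GGUF model works here.
llm = Llama(model_path="./models/qwen2-0_5b-instruct-q8_0.gguf", n_ctx=2048)
try:
    out = llm("Q: Name the planets in the solar system. A:", max_tokens=32)
    print(out["choices"][0]["text"])
finally:
    llm.close()  # explicitly frees the model and context (per #1513)
```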
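
Commit 5212fb08ae (#1333) adds a MinTokensLogitProcessor and a min_tokens argument to the OpenAI-compatible server, which suppresses end-of-sequence until a minimum number of tokens has been generated. A hedged sketch of exercising it over HTTP, assuming the server was started with `python -m llama_cpp.server --model <path>` on its default port 8000:

```python
import requests

# min_tokens is the llama-cpp-python extension added by #1333;
# it is not part of the standard OpenAI completion schema.
resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "prompt": "Write a short sentence about the sea.",
        "max_tokens": 64,
        "min_tokens": 16,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["text"])
```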
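
Commit d634efcdd9 (#1477) adds an rpc_servers parameter to the Llama class for llama.cpp's RPC backend, which offloads computation to remote hosts. A sketch of the call shape, assuming the underlying llama.cpp was built with RPC support and that the parameter takes a comma-separated list of host:port addresses (both assumptions, so check the installed version before relying on them):

```python
from llama_cpp import Llama

# Assumed form: comma-separated "host:port" RPC endpoints (see #1477).
# Requires a llama.cpp build with the RPC backend enabled.
llm = Llama(
    model_path="./models/qwen2-0_5b-instruct-q8_0.gguf",  # placeholder path
    rpc_servers="192.168.1.10:50052,192.168.1.11:50052",  # hypothetical hosts
    n_gpu_layers=-1,  # offload layers to the remote backends (assumed behavior)
)
```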