Aditya Purandare
52d9d70076
docs: Update README.md to fix pip install llama cpp server ( #1187 )
...
Without the single quotes, running the command in zsh fails with a "no matches found" error, because zsh treats the square brackets as a glob pattern before pip ever runs. Adding the quotes fixes it.
```bash
$ pip install llama-cpp-python[server]
zsh: no matches found: llama-cpp-python[server]
```
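A minimal sketch of the quoted form, assuming a POSIX-style shell. The `pkg` variable is hypothetical and only illustrates that single quotes keep the square brackets literal, so the shell does not try to expand them as a filename pattern:

```bash
# Hypothetical variable, used only to show the quoted requirement string.
pkg='llama-cpp-python[server]'

# Print the string to confirm the brackets reach the command unchanged.
printf '%s\n' "$pkg"

# The quoted install command described in this commit:
# pip install 'llama-cpp-python[server]'
```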
Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-23 04:41:22 -05:00
Andrei Betlen
410e02da51
docs: Fix typo
2024-02-23 00:43:31 -05:00
Andrei Betlen
eb56ce2e2a
docs: fix low-level api example
2024-02-22 11:33:05 -05:00
Andrei Betlen
0f8cad6cb7
docs: Update README
2024-02-22 11:31:44 -05:00
Andrei Betlen
045cc12670
docs: Update README
2024-02-22 03:53:52 -05:00
Andrei Betlen
32efed7b07
docs: Update README
2024-02-22 03:25:11 -05:00
Andrei Betlen
d80c5cf29d
docs: fix indentation for mkdocs-material
2024-02-22 02:30:24 -05:00
Andrei
0f8aa4ab5c
feat: Pull models directly from huggingface ( #1206 )
...
* Add from_pretrained method to Llama class
* Update docs
* Merge filename and pattern
2024-02-21 16:25:10 -05:00
Andrei Betlen
c2a234a086
docs: Add embeddings section
2024-02-15 23:15:50 -05:00
Andrei Betlen
4348a6cdf0
docs: Fix typo
2024-02-13 02:04:54 -05:00
Andrei Betlen
b82b0e1014
docs: Temporarily revert function calling docs
2024-02-12 16:27:43 -05:00
Akarshan Biswas
918ff27e50
docs: Set the correct command for compiling with SYCL support ( #1172 )
2024-02-11 13:55:15 -05:00
Jeffrey Fong
901827013b
feat: Integrate functionary v1.4 and v2 models + add custom tokenizer support to Llama class ( #1078 )
...
* convert functionary-v1 chat handler to use hf autotokenizer
* add hf_tokenizer + integrate functionary-v1.4 prompt template
* integrate functionary v2 prompt template
* update readme
* set up parallel function calling wip
* set up parallel function calling
* Update README.md
* Update README.md
* refactor tokenizers
* include old functionary handler for backward compatibility
* add hf_tokenizer_path in server ModelSettings
* convert functionary-v1 chat handler to use hf autotokenizer
* add hf_tokenizer + integrate functionary-v1.4 prompt template
* integrate functionary v2 prompt template
* update readme
* set up parallel function calling wip
* resolve merge conflict
* Update README.md
* Update README.md
* refactor tokenizers
* include old functionary handler for backward compatibility
* add hf_tokenizer_path in server ModelSettings
* Cleanup PR, fix breaking changes
* Use hf_pretrained_model_name_or_path for tokenizer
* fix hf tokenizer in streaming
* update README
* refactor offset mapping
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-07 20:07:03 -05:00
Andrei
fb762a6041
Add speculative decoding ( #1120 )
...
* Add draft model param to llama class, implement basic prompt lookup decoding draft model
* Use samplingcontext for sampling
* Use 1d array
* Use draft model for sampling
* Fix dumb mistake
* Allow for later extensions to the LlamaDraftModel api
* Cleanup
* Adaptive candidate prediction
* Update implementation to match hf transformers
* Tuning
* Fix bug where last token was not used for ngram prediction
* Remove heuristic for num_pred_tokens (no benefit)
* fix: n_candidates bug.
* Add draft_model_num_pred_tokens server setting
* Cleanup
* Update README
2024-01-31 14:08:14 -05:00
Andrei Betlen
247a16de66
docs: Update README
2024-01-30 12:23:07 -05:00
Andrei Betlen
059f6b3ac8
docs: fix typos
2024-01-29 11:02:25 -05:00
Andrei Betlen
843e77e3e2
docs: Add Vulkan build instructions
2024-01-29 11:01:26 -05:00
Andrei Betlen
8c59210062
docs: Fix typo
2024-01-27 19:37:59 -05:00
Andrei Betlen
399fa1e03b
docs: Add JSON and JSON schema mode examples to README
2024-01-27 19:36:33 -05:00
Andrei Betlen
d6fb16e055
docs: Update README
2024-01-25 10:51:48 -05:00
Andrei Betlen
5b258bf840
docs: Update README with more common param examples
2024-01-24 10:51:15 -05:00
Andrei Betlen
88fbccaaa3
docs: Add macosx wrong arch fix to README
2024-01-21 18:38:44 -05:00
Jerry Liu
84380fe9a6
Add llamaindex integration to readme ( #1092 )
2024-01-16 19:10:50 -05:00
Caleb Hoff
f766b70c9a
Fix: Correct typo in README.md ( #1058 )
...
In Llama.create_chat_completion, the `tool_choice` property does not have an s on the end.
2024-01-04 18:12:32 -05:00
Andrei Betlen
f4be84c122
Fix typo
2023-12-22 14:40:44 -05:00
Andrei Betlen
9b3a5939f3
docs: Add multi-model link to readme
2023-12-22 14:40:13 -05:00
evelynmitchell
37da8e863a
Update README.md functionary demo typo ( #996 )
...
missing comma
2023-12-16 19:00:30 -05:00
zocainViken
6bbeea07ae
README.md multimodal params fix ( #967 )
...
multimodal params fix: add logits = True to make llava work
2023-12-11 20:41:38 -05:00
Aniket Maurya
c1d92ce680
fix minor typo ( #958 )
...
* fix minor typo
* Fix typo
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2023-12-11 20:40:38 -05:00
Andrei Betlen
fb32f9d438
docs: Update README
2023-11-28 03:15:01 -05:00
Andrei Betlen
43e006a291
docs: Remove divider
2023-11-28 02:41:50 -05:00
Andrei Betlen
2cc6c9ae2f
docs: Update README, add FAQ
2023-11-28 02:37:34 -05:00
Andrei Betlen
9c68b1804a
docs: Add api reference links in README
2023-11-27 18:54:07 -05:00
Andrei Betlen
41428244f0
docs: Fix README indentation
2023-11-27 18:29:13 -05:00
Andrei Betlen
1539146a5e
docs: Fix README docs link
2023-11-27 18:21:00 -05:00
Anton Vice
aa5a7a1880
Update README.md ( #940 )
...
.ccp >> .cpp
2023-11-26 15:39:38 -05:00
Andrei Betlen
abb1976ad7
docs: Add n_ctx note for multimodal models
2023-11-22 21:07:00 -05:00
Andrei Betlen
36679a58ef
Merge branch 'main' of github.com:abetlen/llama_cpp_python into main
2023-11-22 19:49:59 -05:00
Andrei Betlen
bd43fb2bfe
docs: Update high-level python api examples in README to include chat formats, function calling, and multi-modal models.
2023-11-22 19:49:56 -05:00
Andrei Betlen
d977b44d82
docs: Add links to server functionality
2023-11-22 18:21:02 -05:00
Andrei Betlen
aa815d580c
docs: Link to langchain docs
2023-11-22 18:17:49 -05:00
Andrei Betlen
602ea64ddd
docs: Fix whitespace
2023-11-22 18:09:31 -05:00
Andrei Betlen
f336eebb2f
docs: fix 404 to macos installation guide. Closes #861
2023-11-22 18:07:30 -05:00
Andrei Betlen
1ff2c92720
docs: minor indentation fix
2023-11-22 18:04:18 -05:00
Andrei Betlen
68238b7883
docs: setting n_gqa is no longer required
2023-11-22 18:01:54 -05:00
Andrei Betlen
198178225c
docs: Remove stale warning
2023-11-22 17:59:16 -05:00
Juraj Bednar
5a9770a56b
Improve documentation for server chat formats ( #934 )
2023-11-22 06:10:03 -05:00
James Braza
23a221999f
Documenting server usage ( #768 )
2023-11-21 00:24:22 -05:00
Sujeendran Menon
7b136bb5b1
Fix for shared library not found and compile issues in Windows ( #848 )
...
* fix windows library dll name issue
* Updated README.md Windows instructions
* Update llama_cpp.py to handle different windows dll file versions
2023-11-01 18:55:57 -04:00
Jason Cox
40b22909dc
Update examples from ggml to gguf and add hw-accel note for Web Server ( #688 )
...
* Examples from ggml to gguf
* Use gguf file extension
Update examples to use filenames with gguf extension (e.g. llama-model.gguf).
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2023-09-14 14:48:21 -04:00