This commit is contained in:
Andrei Betlen 2024-02-23 12:34:03 -05:00
commit f94faab686
2 changed files with 6 additions and 4 deletions

View file

@ -76,12 +76,14 @@ Function calling is completely compatible with the OpenAI function calling API a
You'll first need to download one of the available function calling models in GGUF format: You'll first need to download one of the available function calling models in GGUF format:
- [functionary-7b-v1](https://huggingface.co/abetlen/functionary-7b-v1-GGUF) - [functionary](https://huggingface.co/meetkai)
Then when you run the server you'll need to also specify the `functionary` chat_format Then when you run the server you'll need to also specify either `functionary-v1` or `functionary-v2` chat_format.
Note that since functionary requires a HF Tokenizer due to discrepancies between llama.cpp and HuggingFace's tokenizers as mentioned [here](https://github.com/abetlen/llama-cpp-python/blob/main?tab=readme-ov-file#function-calling), you will need to pass in the path to the tokenizer too. The tokenizer files are already included in the respective HF repositories hosting the gguf files.
```bash ```bash
python3 -m llama_cpp.server --model <model_path> --chat_format functionary python3 -m llama_cpp.server --model <model_path_to_functionary_v2_model> --chat_format functionary-v2 --hf_pretrained_model_name_or_path <model_path_to_functionary_v2_tokenizer>
``` ```
Check out this [example notebook](https://github.com/abetlen/llama-cpp-python/blob/main/examples/notebooks/Functions.ipynb) for a walkthrough of some interesting use cases for function calling. Check out this [example notebook](https://github.com/abetlen/llama-cpp-python/blob/main/examples/notebooks/Functions.ipynb) for a walkthrough of some interesting use cases for function calling.

View file

@ -9,7 +9,7 @@
"The OpenAI compatbile web server in `llama-cpp-python` supports function calling.\n", "The OpenAI compatbile web server in `llama-cpp-python` supports function calling.\n",
"\n", "\n",
"Function calling allows API clients to specify a schema that gives the model a format it should respond in.\n", "Function calling allows API clients to specify a schema that gives the model a format it should respond in.\n",
"Function calling in `llama-cpp-python` works by combining models pretrained for function calling such as [`functionary`](https://huggingface.co/abetlen/functionary-7b-v1-GGUF) with constrained sampling to produce a response that is compatible with the schema.\n", "Function calling in `llama-cpp-python` works by combining models pretrained for function calling such as [`functionary`](https://huggingface.co/meetkai) with constrained sampling to produce a response that is compatible with the schema.\n",
"\n", "\n",
"Note however that this improves but does not guarantee that the response will be compatible with the schema.\n", "Note however that this improves but does not guarantee that the response will be compatible with the schema.\n",
"\n", "\n",