From bce6dc0ac2f0570a32ced3cd14408237cf08f8ad Mon Sep 17 00:00:00 2001
From: Jeffrey Fong
Date: Sat, 24 Feb 2024 01:24:10 +0800
Subject: [PATCH] docs: Update Functionary OpenAI Server Readme (#1193)

* update functionary parts in server readme

* add write-up about hf tokenizer
---
 docs/server.md                     | 8 +++++---
 examples/notebooks/Functions.ipynb | 2 +-
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/docs/server.md b/docs/server.md
index cd351ba..c66c9cc 100644
--- a/docs/server.md
+++ b/docs/server.md
@@ -76,12 +76,14 @@ Function calling is completely compatible with the OpenAI function calling API a
 
 You'll first need to download one of the available function calling models in GGUF format:
 
-- [functionary-7b-v1](https://huggingface.co/abetlen/functionary-7b-v1-GGUF)
+- [functionary](https://huggingface.co/meetkai)
 
-Then when you run the server you'll need to also specify the `functionary` chat_format
+Then when you run the server you'll also need to specify either `functionary-v1` or `functionary-v2` as the `chat_format`.
+
+Note that functionary requires an HF tokenizer due to discrepancies between llama.cpp's and HuggingFace's tokenizers, as mentioned [here](https://github.com/abetlen/llama-cpp-python/blob/main?tab=readme-ov-file#function-calling), so you will also need to pass in the path to the tokenizer. The tokenizer files are already included in the respective HF repositories hosting the gguf files.
 
 ```bash
-python3 -m llama_cpp.server --model <model_path> --chat_format functionary
+python3 -m llama_cpp.server --model <model_path> --chat_format functionary-v2 --hf_pretrained_model_name_or_path <hf_tokenizer_path>
 ```
 
 Check out this [example notebook](https://github.com/abetlen/llama-cpp-python/blob/main/examples/notebooks/Functions.ipynb) for a walkthrough of some interesting use cases for function calling.
diff --git a/examples/notebooks/Functions.ipynb b/examples/notebooks/Functions.ipynb
index 7a7b899..f1e5e9a 100644
--- a/examples/notebooks/Functions.ipynb
+++ b/examples/notebooks/Functions.ipynb
@@ -9,7 +9,7 @@
     "The OpenAI compatible web server in `llama-cpp-python` supports function calling.\n",
     "\n",
     "Function calling allows API clients to specify a schema that gives the model a format it should respond in.\n",
-    "Function calling in `llama-cpp-python` works by combining models pretrained for function calling such as [`functionary`](https://huggingface.co/abetlen/functionary-7b-v1-GGUF) with constrained sampling to produce a response that is compatible with the schema.\n",
+    "Function calling in `llama-cpp-python` works by combining models pretrained for function calling such as [`functionary`](https://huggingface.co/meetkai) with constrained sampling to produce a response that is compatible with the schema.\n",
     "\n",
     "Note however that this improves but does not guarantee that the response will be compatible with the schema.\n",
     "\n",
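
For a rough picture of what the documented setup exposes: once the server is started with the flags above, function calling is available through the OpenAI-compatible `/v1/chat/completions` endpoint using the standard OpenAI tools schema. A minimal sketch, assuming the server's default bind address of `localhost:8000`; the `get_current_weather` function and its parameters are hypothetical, purely for illustration:

```bash
# Hypothetical function-calling request against a locally running
# llama_cpp.server instance (default host localhost, default port 8000).
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is the weather in San Francisco?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_current_weather",
          "description": "Get the current weather for a city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {"type": "string", "description": "The city name"}
            },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
```

If the model decides to call the function, the response carries the call and its arguments under `choices[0].message.tool_calls` rather than in `content`, mirroring the OpenAI API.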