# Getting Started
## 🦙 Python Bindings for `llama.cpp`
[![Documentation](https://img.shields.io/badge/docs-passing-green.svg)](https://abetlen.github.io/llama-cpp-python)
[![Tests](https://github.com/abetlen/llama-cpp-python/actions/workflows/test.yaml/badge.svg?branch=main)](https://github.com/abetlen/llama-cpp-python/actions/workflows/test.yaml)
[![PyPI](https://img.shields.io/pypi/v/llama-cpp-python)](https://pypi.org/project/llama-cpp-python/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/llama-cpp-python)](https://pypi.org/project/llama-cpp-python/)
[![PyPI - License](https://img.shields.io/pypi/l/llama-cpp-python)](https://pypi.org/project/llama-cpp-python/)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/llama-cpp-python)](https://pypi.org/project/llama-cpp-python/)

Simple Python bindings for **@ggerganov's** [`llama.cpp`](https://github.com/ggerganov/llama.cpp) library.
This package provides:
- Low-level access to the C API via a `ctypes` interface
- A high-level Python API for text completion
- An OpenAI-like API
- LangChain compatibility
## Installation
Install from PyPI:
```bash
pip install llama-cpp-python
```
## High-level API
```python
>>> from llama_cpp import Llama
>>> llm = Llama(model_path="./models/7B/ggml-model.bin")
>>> output = llm("Q: Name the planets in the solar system? A: ", max_tokens=32, stop=["Q:", "\n"], echo=True)
>>> print(output)
{
  "id": "cmpl-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "object": "text_completion",
  "created": 1679561337,
  "model": "./models/7B/ggml-model.bin",
  "choices": [
    {
      "text": "Q: Name the planets in the solar system? A: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune and Pluto.",
      "index": 0,
      "logprobs": None,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 28,
    "total_tokens": 42
  }
}
```
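The return value is a plain dictionary shaped like an OpenAI completion response, so the generated text and token counts can be read with ordinary indexing. The sketch below works on a hard-coded copy of the output above rather than a live model. `completion_text` is a hypothetical helper written for this example and is not part of `llama_cpp`.

```python
# Hard-coded copy of the response shown above, so this runs without a model.
output = {
    "id": "cmpl-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "object": "text_completion",
    "created": 1679561337,
    "model": "./models/7B/ggml-model.bin",
    "choices": [
        {
            "text": "Q: Name the planets in the solar system? A: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune and Pluto.",
            "index": 0,
            "logprobs": None,
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 14, "completion_tokens": 28, "total_tokens": 42},
}


def completion_text(response: dict, prompt: str = "") -> str:
    """Return the first choice's text, dropping an echoed prompt if present.

    Hypothetical helper for illustration; not part of llama_cpp.
    """
    text = response["choices"][0]["text"]
    # With echo=True the prompt is repeated at the start of the text.
    if prompt and text.startswith(prompt):
        text = text[len(prompt):]
    return text.strip()


prompt = "Q: Name the planets in the solar system? A: "
answer = completion_text(output, prompt)
print(answer)
print(output["usage"]["total_tokens"])
```

Passing `echo=False` (or omitting `echo`) should make the prompt-stripping step unnecessary, since the prompt is then not repeated in the returned text.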
## Web Server
`llama-cpp-python` includes a web server that aims to act as a drop-in replacement for the OpenAI API.
This lets you use llama.cpp-compatible models with any OpenAI-compatible client (language libraries, services, etc.).
To install the server package and get started:
```bash
pip install 'llama-cpp-python[server]'
export MODEL=./models/7B/ggml-model.bin
python3 -m llama_cpp.server
```
Navigate to [http://localhost:8000/docs](http://localhost:8000/docs) to see the OpenAPI documentation.
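Because the server speaks an OpenAI-style HTTP API, it can also be called without any client library. The following is a minimal sketch using only the Python standard library. It assumes the server started above is listening on `localhost:8000` and exposes an OpenAI-style `/v1/completions` route; check the OpenAPI page to confirm the routes and fields your version provides. The function names are hypothetical.

```python
import json
import urllib.request

# Assumption: the server from the commands above is running on localhost:8000
# and serves the OpenAI-style /v1/completions route.
BASE_URL = "http://localhost:8000"


def build_completion_request(prompt: str, max_tokens: int = 32) -> urllib.request.Request:
    """Build (but do not send) a POST request for a text completion."""
    payload = {"prompt": prompt, "max_tokens": max_tokens, "stop": ["Q:", "\n"]}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str) -> str:
    """Send the request and return the text of the first choice.

    Requires the server to be running.
    """
    with urllib.request.urlopen(build_completion_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With the server running, `complete("Q: Name the planets in the solar system? A: ")` would return the generated answer as a string.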
## Low-level API
The low-level API is a direct `ctypes` binding to the C API provided by `llama.cpp`.
The entire API can be found in [llama_cpp/llama_cpp.py](https://github.com/abetlen/llama-cpp-python/blob/master/llama_cpp/llama_cpp.py) and should mirror [llama.h](https://github.com/ggerganov/llama.cpp/blob/master/llama.h).
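To illustrate what a direct `ctypes` binding looks like, here is a minimal sketch that wraps `strlen` from the C standard library. It deliberately does not call anything from `llama.h`, so it runs without a compiled `llama.cpp` or a model file. The pattern is the general `ctypes` one (load a shared library, declare argument and return types, call the function); the wrapper name `c_strlen` is hypothetical, and the sketch assumes a POSIX system.

```python
import ctypes
import ctypes.util

# Stand-in library: the C standard library instead of llama.cpp.
# Assumes a POSIX system where the C library can be loaded this way.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature `size_t strlen(const char *s)` so ctypes
# converts arguments and the return value correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: bytes) -> int:
    """Thin Python wrapper around the C function (hypothetical name)."""
    return libc.strlen(text)


print(c_strlen(b"llama"))  # 5
```

Note that C strings cross the boundary as `bytes`, not `str`, which is why the argument is written as `b"llama"`.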
## Development
This package is under active development and I welcome any contributions.
To get started, clone the repository and install the package in development mode:
```bash
git clone git@github.com:abetlen/llama-cpp-python.git
cd llama-cpp-python
git submodule update --init --recursive
# Will need to be re-run any time vendor/llama.cpp is updated
python3 setup.py develop
```
## API Reference
::: llama_cpp.Llama
    options:
        members:
            - __init__
            - tokenize
            - detokenize
            - reset
            - eval
            - sample
            - generate
            - create_embedding
            - embed
            - create_completion
            - __call__
            - create_chat_completion
            - set_cache
            - save_state
            - load_state
            - token_bos
            - token_eos
        show_root_heading: true

::: llama_cpp.LlamaCache

::: llama_cpp.LlamaState

::: llama_cpp.llama_cpp
    options:
        show_if_no_docstring: true
## License
This project is licensed under the terms of the MIT license.