# OpenAI compatibility
> **Note:** OpenAI compatibility is experimental and is subject to major adjustments including breaking changes. For fully-featured access to the Ollama API, see the Ollama [Python library](https://github.com/ollama/ollama-python), [JavaScript library](https://github.com/ollama/ollama-js) and [REST API](https://github.com/ollama/ollama/blob/main/docs/api.md).

Ollama provides experimental compatibility with parts of the [OpenAI API](https://platform.openai.com/docs/api-reference) to help connect existing applications to Ollama.
## Usage
### OpenAI Python library
```python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',

    # required but ignored
    api_key='ollama',
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='llama3',
)

list_completion = client.models.list()

model = client.models.retrieve("llama3")

embeddings = client.embeddings.create(
    model="all-minilm",
    input=["why is the sky blue?", "why is the grass green?"]
)
```
### OpenAI JavaScript library
```javascript
import OpenAI from 'openai'

const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1/',

  // required but ignored
  apiKey: 'ollama',
})

const chatCompletion = await openai.chat.completions.create({
  messages: [{ role: 'user', content: 'Say this is a test' }],
  model: 'llama3',
})

const listCompletion = await openai.models.list()

const model = await openai.models.retrieve("llama3");

const embedding = await openai.embeddings.create({
  model: "all-minilm",
  input: ["why is the sky blue?", "why is the grass green?"],
});
```
### `curl`
```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama3",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'

curl http://localhost:11434/v1/models

curl http://localhost:11434/v1/models/llama3

curl http://localhost:11434/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{
        "model": "all-minilm",
        "input": ["why is the sky blue?", "why is the grass green?"]
    }'
```
## Endpoints
### `/v1/chat/completions`
#### Supported features
- [x] Chat completions
- [x] Streaming (see the sketch after this list)
- [x] JSON mode
- [x] Reproducible outputs
- [x] Tools (streaming support coming soon)
- [ ] Vision
- [ ] Logprobs
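
The streaming item above is exercised like any other OpenAI client stream; a minimal sketch using the Python client from the Usage section (model name and prompt are illustrative):

```python
from openai import OpenAI

client = OpenAI(base_url='http://localhost:11434/v1/', api_key='ollama')

# request a streamed response and print tokens as they arrive
stream = client.chat.completions.create(
    model='llama3',
    messages=[{'role': 'user', 'content': 'Say this is a test'}],
    stream=True,
)
for chunk in stream:
    # `delta.content` can be None on some chunks, e.g. the final one
    print(chunk.choices[0].delta.content or '', end='', flush=True)
```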
#### Supported request fields
- [x] `model`
- [x] `messages`
  - [x] Text `content`
  - [ ] Array of `content` parts
- [x] `frequency_penalty`
- [x] `presence_penalty`
- [x] `response_format` (see the sketch after this list)
- [x] `seed`
- [x] `stop`
- [x] `stream`
- [x] `temperature`
- [x] `top_p`
- [x] `max_tokens`
- [x] `tools`
- [ ] `tool_choice`
- [ ] `logit_bias`
- [ ] `user`
- [ ] `n`
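
Several of the checked fields combine naturally in one request; a minimal sketch with illustrative values, showing reproducible outputs (`seed` plus a fixed `temperature`) and JSON mode via `response_format`:

```python
from openai import OpenAI

client = OpenAI(base_url='http://localhost:11434/v1/', api_key='ollama')

# a fixed seed and zero temperature make the output reproducible;
# response_format switches the model into JSON mode
completion = client.chat.completions.create(
    model='llama3',
    messages=[{'role': 'user', 'content': 'List three colors as a JSON object.'}],
    seed=42,
    temperature=0,
    max_tokens=200,
    response_format={'type': 'json_object'},
)
print(completion.choices[0].message.content)
```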
### `/v1/models`
#### Notes
- `created` corresponds to when the model was last modified
- `owned_by` corresponds to the ollama username, defaulting to `"library"`
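
For illustration, a short sketch reading these fields through the Python client (it assumes `created` is a Unix timestamp, as in the OpenAI API):

```python
from datetime import datetime, timezone

from openai import OpenAI

client = OpenAI(base_url='http://localhost:11434/v1/', api_key='ollama')

# print each model's name, owner, and last-modified time
for m in client.models.list():
    print(m.id, m.owned_by, datetime.fromtimestamp(m.created, tz=timezone.utc))
```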
### `/v1/models/{model}`
#### Notes
- `created` corresponds to when the model was last modified
- `owned_by` corresponds to the ollama username, defaulting to `"library"`
### `/v1/embeddings`
#### Supported request fields
- [x] `model`
- [x] `input` (see the sketch after this list)
  - [x] string
  - [x] array of strings
  - [ ] array of tokens
  - [ ] array of token arrays
- [ ] `encoding_format`
- [ ] `dimensions`
- [ ] `user`
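
As a usage sketch, the vectors returned for an array of strings can be compared directly; the cosine similarity below is plain Python, not part of the API:

```python
from math import sqrt

from openai import OpenAI

client = OpenAI(base_url='http://localhost:11434/v1/', api_key='ollama')

resp = client.embeddings.create(
    model="all-minilm",
    input=["why is the sky blue?", "why is the grass green?"],
)

# cosine similarity between the two returned vectors
a = resp.data[0].embedding
b = resp.data[1].embedding
dot = sum(x * y for x, y in zip(a, b))
print(dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b))))
```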
## Models
Before using a model, pull it locally with `ollama pull`:
```shell
ollama pull llama3
```
### Default model names
|
For tooling that relies on default OpenAI model names such as `gpt-3.5-turbo`, use `ollama cp` to copy an existing model name to a temporary name:
```shell
ollama cp llama3 gpt-3.5-turbo
```
Afterwards, this new model name can be specified in the `model` field:
```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
```
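
Equivalently, existing OpenAI-client code that hard-codes a default model name should then work unchanged against the local server; a minimal sketch with the Python client:

```python
from openai import OpenAI

client = OpenAI(base_url='http://localhost:11434/v1/', api_key='ollama')

# `gpt-3.5-turbo` now resolves to the copied llama3 model
completion = client.chat.completions.create(
    model='gpt-3.5-turbo',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(completion.choices[0].message.content)
```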