# OpenAI compatibility
> **Note:** OpenAI compatibility is experimental and is subject to major adjustments including breaking changes. For fully-featured access to the Ollama API, see the Ollama [Python library](https://github.com/ollama/ollama-python), [JavaScript library](https://github.com/ollama/ollama-js) and [REST API](https://github.com/ollama/ollama/blob/main/docs/api.md).
Ollama provides experimental compatibility with parts of the [OpenAI API](https://platform.openai.com/docs/api-reference) to help connect existing applications to Ollama.
## Usage
### OpenAI Python library
```python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',

    # required but ignored
    api_key='ollama',
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='llama3',
)
```
### OpenAI JavaScript library
```javascript
import OpenAI from 'openai'

const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1/',

  // required but ignored
  apiKey: 'ollama',
})

const chatCompletion = await openai.chat.completions.create({
  messages: [{ role: 'user', content: 'Say this is a test' }],
  model: 'llama3',
})
```
### `curl`
```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama3",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
```
## Endpoints
### `/v1/chat/completions`
#### Supported features
- [x] Chat completions
- [x] Streaming
- [x] JSON mode
- [x] Reproducible outputs
- [ ] Vision
- [ ] Function calling
- [ ] Logprobs
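
As an illustration of the streaming support listed above, the sketch below reassembles text from a streamed response without the OpenAI client. It assumes the stream uses OpenAI's server-sent events framing (one `data: ` line per JSON chunk, ending with `data: [DONE]`) and that each chunk carries its text in `choices[].delta.content`. The helper name `collect_stream_text` and the sample events are hypothetical and written by hand for illustration.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Join the content deltas from an OpenAI-style SSE stream."""
    parts = []
    for raw in lines:
        line = raw.strip()
        # Skip blank separators and anything that is not a data event.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # The sentinel marks the end of the stream.
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Hand-written sample events in the assumed format, not captured server output.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo!"}}]}',
    "",
    "data: [DONE]",
]
print(collect_stream_text(sample))
```

With the OpenAI Python library, passing `stream=True` to `client.chat.completions.create` yields already-parsed chunks, so a helper like this is only needed when reading the HTTP response directly.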
#### Supported request fields
- [x] `model`
- [x] `messages`
  - [x] Text `content`
  - [ ] Array of `content` parts
- [x] `frequency_penalty`
- [x] `presence_penalty`
- [x] `response_format`
- [x] `seed`
- [x] `stop`
- [x] `stream`
- [x] `temperature`
- [x] `top_p`
- [x] `max_tokens`
- [ ] `logit_bias`
- [ ] `tools`
- [ ] `tool_choice`
- [ ] `user`
- [ ] `n`
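
The checklist above can be turned into a small client-side guard. The following is a minimal sketch, not part of Ollama or the OpenAI library: the helper `build_chat_request` is hypothetical, and rejecting unsupported fields before sending is a design choice made here, not documented server behavior. The `response_format` value shown is OpenAI's JSON mode setting.

```python
import json

# Field names copied from the checklist above.
SUPPORTED = {
    "frequency_penalty", "presence_penalty", "response_format", "seed",
    "stop", "stream", "temperature", "top_p", "max_tokens",
}
UNSUPPORTED = {"logit_bias", "tools", "tool_choice", "user", "n"}


def build_chat_request(model, messages, **options):
    """Build a /v1/chat/completions body, failing early on unsupported fields."""
    unsupported = sorted(UNSUPPORTED.intersection(options))
    if unsupported:
        raise ValueError(f"fields not supported: {unsupported}")
    unknown = sorted(set(options) - SUPPORTED)
    if unknown:
        raise ValueError(f"unrecognized fields: {unknown}")
    for message in messages:
        # Only plain-text content is listed as supported, not arrays of parts.
        if not isinstance(message.get("content"), str):
            raise TypeError("message content must be a string")
    return {"model": model, "messages": messages, **options}


body = build_chat_request(
    "llama3",
    [{"role": "user", "content": "List three colors as JSON."}],
    response_format={"type": "json_object"},
    seed=42,
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the request body, or its keys passed as keyword arguments to `client.chat.completions.create`.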
#### Notes
- Setting `seed` will always set `temperature` to `0`
- `finish_reason` will always be `stop`
- `usage.prompt_tokens` will be 0 for completions where prompt evaluation is cached
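
One consequence of these notes is that a reported `usage.prompt_tokens` of 0 may mean the prompt evaluation was cached, not that the prompt was empty. A minimal sketch of client code that treats 0 as "unknown" follows; the helper name `prompt_tokens_or_none` is hypothetical and the response fragment is hand-written for illustration.

```python
from typing import Optional


def prompt_tokens_or_none(response: dict) -> Optional[int]:
    """Return the prompt token count, or None when it is missing or reported as 0."""
    usage = response.get("usage") or {}
    count = usage.get("prompt_tokens") or 0
    return count if count > 0 else None


# Hand-written response fragment for illustration, not real server output.
cached = {
    "choices": [{"finish_reason": "stop"}],
    "usage": {"prompt_tokens": 0, "completion_tokens": 12},
}
print(prompt_tokens_or_none(cached))
```

Likewise, because `finish_reason` is always `stop`, it cannot be used to tell whether a response was cut off by `max_tokens`.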
## Models
Before using a model, pull it locally with `ollama pull`:
```shell
ollama pull llama3
```
### Default model names
For tooling that relies on default OpenAI model names such as `gpt-3.5-turbo`, use `ollama cp` to copy an existing model name to a temporary name:
```
ollama cp llama3 gpt-3.5-turbo
```
Afterwards, this new model name can be specified in the `model` field:
```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
```
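
When several default names are needed, the `ollama cp` step can be scripted. The sketch below assumes the `ollama` CLI is on the `PATH` and the source model has already been pulled. The helper names `alias_command` and `create_aliases` are hypothetical; only the `ollama cp <source> <destination>` invocation comes from the text above.

```python
import subprocess
from typing import Callable, Dict, List


def alias_command(source: str, alias: str) -> List[str]:
    """Build the `ollama cp` command that copies `source` to `alias`."""
    return ["ollama", "cp", source, alias]


def create_aliases(aliases: Dict[str, str], run: Callable = subprocess.run) -> None:
    """Run `ollama cp` once per alias; `aliases` maps alias name to source model."""
    for alias, source in aliases.items():
        # check=True raises CalledProcessError if the copy fails.
        run(alias_command(source, alias), check=True)


# Only prints the command; call create_aliases(...) to actually run it.
print(alias_command("llama3", "gpt-3.5-turbo"))
```

Calling `create_aliases({"gpt-3.5-turbo": "llama3"})` would then run the same command as the `ollama cp` example above.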