commit
d7d33e5255
2 changed files with 68 additions and 0 deletions
1
examples/flyio/.gitignore
vendored
Normal file
1
examples/flyio/.gitignore
vendored
Normal file
|
@ -0,0 +1 @@
|
|||
fly.toml
|
67
examples/flyio/README.md
Normal file
67
examples/flyio/README.md
Normal file
|
@ -0,0 +1,67 @@
|
|||
# Deploy Ollama to Fly.io
|
||||
|
||||
> Note: this example exposes a public endpoint and does not configure authentication. Use with care.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Ollama: https://ollama.com/download
|
||||
- Fly.io account. Sign up for a free account: https://fly.io/app/sign-up
|
||||
|
||||
## Steps
|
||||
|
||||
1. Login to Fly.io
|
||||
|
||||
```bash
|
||||
fly auth login
|
||||
```
|
||||
|
||||
1. Create a new Fly app
|
||||
|
||||
```bash
|
||||
fly launch --name <name> --image ollama/ollama --internal-port 11434 --vm-size shared-cpu-8x --now
|
||||
```
|
||||
|
||||
1. Pull and run `orca-mini:3b`
|
||||
|
||||
```bash
|
||||
OLLAMA_HOST=https://<name>.fly.dev ollama run orca-mini:3b
|
||||
```
|
||||
|
||||
`shared-cpu-8x` is a free-tier eligible machine type. For better performance, switch to a `performance` or `dedicated` machine type or attach a GPU for hardware acceleration (see below).
|
||||
|
||||
## (Optional) Persistent Volume
|
||||
|
||||
By default Fly Machines use ephemeral storage which is problematic if you want to use the same model across restarts without pulling it again. Create and attach a persistent volume to store the downloaded models:
|
||||
|
||||
1. Create the Fly Volume
|
||||
|
||||
```bash
|
||||
fly volume create ollama
|
||||
```
|
||||
|
||||
1. Update `fly.toml` and add `[mounts]`
|
||||
|
||||
```toml
|
||||
[mounts]
|
||||
source = "ollama"
|
||||
destination = "/mnt/ollama/models"
|
||||
```
|
||||
|
||||
1. Update `fly.toml` and add `[env]`
|
||||
|
||||
```toml
|
||||
[env]
|
||||
OLLAMA_MODELS = "/mnt/ollama/models"
|
||||
```
|
||||
|
||||
1. Deploy your app
|
||||
|
||||
```bash
|
||||
fly deploy
|
||||
```
|
||||
|
||||
## (Optional) Hardware Acceleration
|
||||
|
||||
Fly.io GPU is currently in waitlist. Sign up for the waitlist: https://fly.io/gpu
|
||||
|
||||
Once you've been accepted, create the app with the additional flags `--vm-gpu-kind a100-pcie-40gb` or `--vm-gpu-kind a100-pcie-80gb`.
|
Loading…
Reference in a new issue