Ollama

Run large language models with llama.cpp.

Note: certain models that can be run with Ollama are intended for research and/or non-commercial use only.

Features

  • Download and run popular large language models
  • Switch between multiple models on the fly
  • Hardware acceleration where available (Metal, CUDA)
  • Fast inference server written in Go, powered by llama.cpp
  • REST API to use with your application (Python and TypeScript SDKs coming soon)

Install

  • Download for macOS
  • Download for Windows (coming soon)
  • Docker: docker run -p 11434:11434 ollama/ollama

You can also build the binary from source.

Quickstart

Run a fast and simple model.

ollama run orca

Example models

💬 Chat

Have a conversation.

ollama run vicuna "Why is the sky blue?"

🗺️ Instructions

Get a helping hand.

ollama run orca "Write an email to my boss."

📖 Storytelling

Venture into the unknown.

ollama run nous-hermes "Once upon a time"

Advanced usage

Run a local model

ollama run ~/Downloads/vicuna-7b-v1.3.ggmlv3.q4_1.bin

Building

make

To run it, start the server:

./ollama server &

Finally, run a model!

./ollama run ~/Downloads/vicuna-7b-v1.3.ggmlv3.q4_1.bin

API Reference

POST /api/pull

Download a model

curl -X POST http://localhost:11434/api/pull -d '{"model": "orca"}'
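
The same request can be sent from Python using only the standard library. This is a minimal sketch, not an official client: it assumes the server listens on localhost:11434 (the port published in the Docker command), the helper names `build_pull_request` and `pull_model` are hypothetical, and the response is returned as raw bytes because its format is not described here.

```python
import json
import urllib.request

# Assumption: the server listens on the port published in the Docker command.
OLLAMA_URL = "http://localhost:11434"


def build_pull_request(model, base_url=OLLAMA_URL):
    """Build (but do not send) a POST /api/pull request for `model`."""
    body = json.dumps({"model": model}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/pull",
        data=body,
        method="POST",
    )


def pull_model(model, base_url=OLLAMA_URL):
    """Send the pull request and return the raw response bytes."""
    with urllib.request.urlopen(build_pull_request(model, base_url)) as resp:
        return resp.read()


if __name__ == "__main__":
    # Requires a running server.
    print(pull_model("orca"))
```

Building the request separately from sending it makes the URL and JSON body easy to inspect without a running server.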

POST /api/generate

Complete a prompt

curl -X POST http://localhost:11434/api/generate -d '{"model": "orca", "prompt": "hello!", "stream": true}'
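
A streaming request can be consumed from Python in a similar way. The sketch below assumes the server is reachable at localhost:11434 and that it sends its output incrementally when "stream" is true. The helper names `build_generate_request` and `generate` are hypothetical, and because the format of each streamed chunk is not specified here, the code yields raw lines rather than parsing them.

```python
import json
import urllib.request

# Assumption: same host and port as the curl example.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(model, prompt, stream=True, base_url=OLLAMA_URL):
    """Build (but do not send) a POST /api/generate request."""
    payload = {"model": model, "prompt": prompt, "stream": stream}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
    )


def generate(model, prompt, base_url=OLLAMA_URL):
    """Yield raw response lines as the server sends them."""
    request = build_generate_request(model, prompt, base_url=base_url)
    with urllib.request.urlopen(request) as response:
        # Iterating over the response reads it line by line.
        for line in response:
            yield line


if __name__ == "__main__":
    # Requires a running server with the model available.
    for chunk in generate("orca", "hello!"):
        print(chunk.decode("utf-8", errors="replace"), end="")
```

Using a generator lets the caller display output as it arrives instead of waiting for the full completion.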