Ollama

Run large language models with llama.cpp.

Note: certain models that can be run with Ollama are intended for research and/or non-commercial use only.

Features

  • Download and run popular large language models
  • Switch between multiple models on the fly
  • Hardware acceleration where available (Metal, CUDA)
  • Fast inference server written in Go, powered by llama.cpp
  • REST API to use with your application (Python and TypeScript SDKs coming soon)

Install

  • Download for macOS with Apple Silicon (Intel coming soon)
  • Download for Windows (coming soon)

You can also build the binary from source.

Quickstart

Run a fast and simple model.

ollama run orca

Example models

💬 Chat

Have a conversation.

ollama run vicuna "Why is the sky blue?"

🗺️ Instructions

Get a helping hand.

ollama run orca "Write an email to my boss."

🔎 Ask questions about documents

Send the contents of a document and ask questions about it.

ollama run nous-hermes "$(cat input.txt), please summarize this story"

📖 Storytelling

Venture into the unknown.

ollama run nous-hermes "Once upon a time"

Advanced usage

Run a local model

ollama run ~/Downloads/vicuna-7b-v1.3.ggmlv3.q4_1.bin

Building

go build .

To run it, start the server:

./ollama server &

Finally, run a model!

./ollama run ~/Downloads/vicuna-7b-v1.3.ggmlv3.q4_1.bin

API Reference

POST /api/pull

Download a model

curl -X POST http://localhost:11434/api/pull -d '{"model": "orca"}'

POST /api/generate

Complete a prompt

curl -X POST http://localhost:11434/api/generate -d '{"model": "orca", "prompt": "hello!"}'