Ollama

Run large language models with llama.cpp.

Note: certain models that can be run with this project are intended for research and/or non-commercial use only.

Features

  • Download and run popular large language models
  • Switch between multiple models on the fly
  • Hardware acceleration where available (Metal, CUDA)
  • Fast inference server written in Go, powered by llama.cpp
  • REST API to use with your application (Python and TypeScript SDKs coming soon)

Install

  • Download for macOS
  • Download for Windows (coming soon)
  • Docker: docker run -p 8080:8080 ollama/ollama

You can also build the binary from source.

Quickstart

Run the model that started it all.

ollama run llama

Example models

💬 Chat

Have a conversation.

ollama run vicuna "Why is the sky blue?"

🗺️ Instructions

Ask questions. Get answers.

ollama run orca "Write an email to my boss."

👩‍💻 Code completion

Sometimes you just need a little help writing code.

ollama run replit "Give me react code to render a button"

📖 Storytelling

Venture into the unknown.

ollama run storyteller "Once upon a time"
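
The commands above can also be driven from a script. The following is a minimal sketch, assuming the `ollama` binary is on your PATH and that `ollama run MODEL "PROMPT"` prints its answer to standard output and then exits. The helper names `ollama_command` and `run_model` are hypothetical and not part of this project.

```python
import shutil
import subprocess


def ollama_command(model, prompt=None, binary="ollama"):
    """Build the argument list for `ollama run MODEL [PROMPT]`."""
    argv = [binary, "run", model]
    if prompt is not None:
        argv.append(prompt)
    return argv


def run_model(model, prompt=None, binary="ollama"):
    """Run the CLI once and return whatever it wrote to standard output.

    Assumption: the command exits after answering the prompt.
    """
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary!r} was not found on PATH")
    result = subprocess.run(
        ollama_command(model, prompt, binary),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

For example, `print(run_model("vicuna", "Why is the sky blue?"))` would mirror the chat command shown earlier. Passing the prompt as a separate list element avoids any shell quoting issues.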

Building

make

To run it, start the server:

./ollama server &

Finally, run a model!

./ollama run ~/Downloads/vicuna-7b-v1.3.ggmlv3.q4_1.bin

API Reference

POST /api/generate

Complete a prompt

curl --unix-socket ~/.ollama/ollama.sock http://localhost/api/generate \
    -X POST \
    -d '{"model": "/path/to/model", "prompt": "Once upon a time", "stream": true}'
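
The same request can be sent from Python using only the standard library. This is a rough sketch rather than an official client: it assumes the server listens on the Unix socket used in the curl example, and it yields raw response lines because the response format is not documented here. The `Content-Type` header is also an assumption (curl's `-d` does not send JSON by default). `UnixHTTPConnection`, `build_body`, and `generate` are hypothetical names.

```python
import http.client
import json
import os
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=60):
        # The host is only used for the Host header; no TCP connection is made.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def build_body(model, prompt, stream=True):
    """Serialize the same JSON fields the curl example sends."""
    return json.dumps({"model": model, "prompt": prompt, "stream": stream})


def generate(model, prompt, socket_path="~/.ollama/ollama.sock"):
    """Yield raw response lines as the server sends them."""
    conn = UnixHTTPConnection(os.path.expanduser(socket_path))
    try:
        conn.request(
            "POST",
            "/api/generate",
            body=build_body(model, prompt),
            headers={"Content-Type": "application/json"},  # assumption
        )
        response = conn.getresponse()
        for line in response:
            yield line.decode("utf-8", errors="replace")
    finally:
        conn.close()
```

With the server running, `for chunk in generate("/path/to/model", "Once upon a time"): print(chunk, end="")` prints the output as it arrives.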