# Ollama
Run large language models with llama.cpp
> Note: certain models that can be run with this project are intended for research and/or non-commercial use only.
## Features
- Download and run popular large language models
- Switch between multiple models on the fly
- Hardware acceleration where available (Metal, CUDA)
- Fast inference server written in Go, powered by llama.cpp
- REST API to use with your application (Python and TypeScript SDKs coming soon)
## Install
- Download for macOS
- Download for Windows (coming soon)
- Docker:

  ```
  docker run -p 8080:8080 ollama/ollama
  ```
You can also build the binary from source.
## Quickstart
Run the model that started it all.
```
ollama run llama
```
## Example models

### 💬 Chat

Have a conversation.

```
ollama run vicuna "Why is the sky blue?"
```

### 🗺️ Instructions

Ask questions. Get answers.

```
ollama run orca "Write an email to my boss."
```

### 👩‍💻 Code completion

Sometimes you just need a little help writing code.

```
ollama run replit "Give me react code to render a button"
```

### 📖 Storytelling

Venture into the unknown.

```
ollama run storyteller "Once upon a time"
```
## Building

```
make
```

To run it, start the server:

```
./ollama server &
```

Finally, run a model!

```
./ollama run ~/Downloads/vicuna-7b-v1.3.ggmlv3.q4_1.bin
```
## API Reference

### `POST /api/generate`

Complete a prompt.
```
curl --unix-socket ~/.ollama/ollama.sock http://localhost/api/generate \
  -X POST \
  -d '{"model": "/path/to/model", "prompt": "Once upon a time", "stream": true}'
```
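
The same request can be made from any language that can speak HTTP over a unix socket, without waiting for the SDKs. Below is a minimal Go sketch that mirrors the curl call above: the socket path, endpoint, and request body come straight from that example, and the streamed response is simply copied to stdout. Treat it as an illustration under those assumptions, not an official client.

```go
// generate.go — a minimal sketch of calling the generate endpoint from Go.
// Assumes the server is listening on ~/.ollama/ollama.sock, as in the curl
// example above. The model path is a placeholder, just like in that example.
package main

import (
	"context"
	"io"
	"log"
	"net"
	"net/http"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		log.Fatal(err)
	}
	socket := filepath.Join(home, ".ollama", "ollama.sock")

	// Send all HTTP traffic over the unix socket instead of TCP.
	client := &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				return (&net.Dialer{}).DialContext(ctx, "unix", socket)
			},
		},
	}

	body := strings.NewReader(`{"model": "/path/to/model", "prompt": "Once upon a time", "stream": true}`)
	resp, err := client.Post("http://localhost/api/generate", "application/json", body)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Print the streamed response as it arrives.
	if _, err := io.Copy(os.Stdout, resp.Body); err != nil {
		log.Fatal(err)
	}
}
```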