
Ollama

Run large language models with llama.cpp.

Note: certain models that can be run with this project are intended for research and/or non-commercial use only.

Features

  • Download and run popular large language models
  • Switch between multiple models on the fly
  • Hardware acceleration where available (Metal, CUDA)
  • Fast inference server written in Go, powered by llama.cpp
  • REST API to use from your application (Python and TypeScript SDKs coming soon)

Install

  • Download for macOS
  • Download for Windows (coming soon)
  • Docker: docker run -p 8080:8080 ollama/ollama

You can also build the binary from source.

Quickstart

Run the model that started it all.

ollama run llama

Example models

💬 Chat

Have a conversation.

ollama run vicuna "Why is the sky blue?"

🗺️ Instructions

Ask questions. Get answers.

ollama run orca "Write an email to my boss."

👩‍💻 Code completion

Sometimes you just need a little help writing code.

ollama run replit "Give me react code to render a button"

📖 Storytelling

Venture into the unknown.

ollama run storyteller "Once upon a time"

Building

go generate ./...
go build .

To run it, start the server:

./ollama server &

Finally, run a model!

./ollama run ~/Downloads/vicuna-7b-v1.3.ggmlv3.q4_1.bin

API Reference

POST /completion

Complete a prompt

curl -X POST http://localhost:8080/completion \
  -H 'Content-Type: application/json' \
  -d '{"model": "/path/to/model", "prompt": "Once upon a time", "stream": true}'