Ollama
Run large language models with llama.cpp.
Note: certain models that can be run with Ollama are intended for research and/or non-commercial use only.
Features
- Download and run popular large language models
- Switch between multiple models on the fly
- Hardware acceleration where available (Metal, CUDA)
- Fast inference server written in Go, powered by llama.cpp
- REST API to use from your application (Python and TypeScript SDKs coming soon)
Install
- Download for macOS
- Download for Windows (coming soon)
You can also build the binary from source.
Quickstart
Run a fast and simple model.
ollama run orca
Example models
💬 Chat
Have a conversation.
ollama run vicuna "Why is the sky blue?"
🗺️ Instructions
Get a helping hand.
ollama run orca "Write an email to my boss."
🔎 Ask questions about documents
Send the contents of a document and ask questions about it.
ollama run nous-hermes "$(cat input.txt), please summarize this story"
📖 Storytelling
Venture into the unknown.
ollama run nous-hermes "Once upon a time"
Advanced usage
Run a local model
ollama run ~/Downloads/vicuna-7b-v1.3.ggmlv3.q4_1.bin
Building
go build .
To run it, start the server:
./ollama server &
Finally, run a model!
./ollama run ~/Downloads/vicuna-7b-v1.3.ggmlv3.q4_1.bin
API Reference
POST /api/pull
Download a model
curl -X POST http://localhost:11434/api/pull -d '{"model": "orca"}'
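The pull endpoint reports progress while the model downloads. As a minimal sketch, assuming the progress updates arrive as a stream of JSON objects (the exact format may differ in this version) and that `jq` is installed, you can pretty-print them as they come in:

```
# sketch: watch pull progress as it streams
# assumes the response is a stream of JSON objects; requires jq
curl -s -X POST http://localhost:11434/api/pull -d '{"model": "orca"}' | jq .
```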
POST /api/generate
Complete a prompt
curl -X POST http://localhost:11434/api/generate -d '{"model": "orca", "prompt": "hello!"}'
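The generate endpoint streams its output as the model produces it. Here is a minimal sketch of consuming that stream, assuming the server emits newline-delimited JSON objects with a `response` field (both details are assumptions about this version of the API) and that `jq` is installed:

```
# sketch: print the completion as it streams in
# assumes newline-delimited JSON with a "response" field; requires jq
curl -s -X POST http://localhost:11434/api/generate \
  -d '{"model": "orca", "prompt": "hello!"}' | \
  jq -j '.response // empty'
```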