Jeffrey Morgan 413a2e4f91 set `DEBIAN_FRONTEND=noninteractive` correctly		2023-09-24 20:35:42 -07:00
api	DRAFT: add a simple python client to access ollama (#522)	2023-09-14 16:37:38 -07:00
app	app: dont package `ggml-metal.metal`	2023-08-31 17:41:09 -04:00
cmd	fix end-of-line issue with the new prompt (#582)	2023-09-23 17:20:30 -07:00
docs	embed libraries using cmake	2023-09-20 14:41:57 -07:00
examples	Merge pull request #273 from jmorganca/matt/moreexamples	2023-08-31 16:31:59 -07:00
format	remove unused openssh key types	2023-09-06 14:34:09 -07:00
llm	silence warm up log	2023-09-21 14:53:33 -07:00
parser	Merge pull request #290 from jmorganca/add-adapter-layers	2023-08-10 17:23:01 -07:00
progressbar	vendor in progress bar and change to bytes instead of bibytes (#130)	2023-07-19 17:24:03 -07:00
scripts	set `DEBIAN_FRONTEND=noninteractive` correctly	2023-09-24 20:35:42 -07:00
server	check other request fields before load short circuit in `/api/generate`	2023-09-22 23:50:55 -04:00
vector	embed text document in modelfile	2023-08-08 11:27:17 -04:00
version	add version	2023-08-22 09:40:58 -07:00
.dockerignore	add `.env` to `.dockerignore`	2023-09-23 00:53:48 -04:00
.gitignore	update docs for subprocess	2023-08-30 17:54:02 -04:00
.gitmodules	silence warm up log	2023-09-21 14:53:33 -07:00
.prettierrc.json	move .prettierrc.json to root	2023-07-02 17:34:46 -04:00
Dockerfile	replace dockerfile	2023-09-22 11:57:38 -07:00
Dockerfile.build	Add `Dockerfile.build` for building linux binaries (#558)	2023-09-22 15:20:12 -04:00
go.mod	switch to forked readline lib which doesn't wreck the repl prompt (#578)	2023-09-22 12:17:45 -07:00
go.sum	add word wrapping for lines which are longer than the terminal width (#553)	2023-09-22 13:36:08 -07:00
LICENSE	`proto` -> `ollama`	2023-06-26 15:57:13 -04:00
main.go	set non-zero error code on error	2023-08-14 14:09:58 -07:00
README.md	Updated README section on community projects for table (#550)	2023-09-18 15:22:50 -04:00

README.md

Ollama

Run, create, and share large language models (LLMs).

Note: Ollama is in early preview. Please report any issues you find.

Download

Download for macOS
Download for Windows and Linux (coming soon)
Build from source

Quickstart

To run and chat with Llama 2, the new model by Meta:

ollama run llama2

Model library

Ollama supports the open-source models listed at ollama.ai/library.

Here are some example open-source models that can be downloaded:

Model	Parameters	Size	Download
Llama 2	7B	3.8GB	`ollama pull llama2`
Llama 2 13B	13B	7.3GB	`ollama pull llama2:13b`
Llama 2 70B	70B	39GB	`ollama pull llama2:70b`
Llama 2 Uncensored	7B	3.8GB	`ollama pull llama2-uncensored`
Code Llama	7B	3.8GB	`ollama pull codellama`
Orca Mini	3B	1.9GB	`ollama pull orca-mini`
Vicuna	7B	3.8GB	`ollama pull vicuna`
Nous-Hermes	7B	3.8GB	`ollama pull nous-hermes`
Nous-Hermes 13B	13B	7.3GB	`ollama pull nous-hermes:13b`
Wizard Vicuna Uncensored	13B	7.3GB	`ollama pull wizard-vicuna`

Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.
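
As a rough illustration of the note above, the guidance can be encoded as a small lookup. This is only a sketch: the names `RAM_GUIDANCE_GB`, `recommended_ram_gb`, and `can_run` are hypothetical and not part of Ollama, and sizes the note does not mention (such as 70B) are deliberately left without a figure.

```python
# Minimum RAM guidance copied from the note above:
# model size in billions of parameters -> suggested RAM in GB.
RAM_GUIDANCE_GB = {3: 8, 7: 16, 13: 32}


def recommended_ram_gb(params_billion):
    """Return the suggested minimum RAM in GB, or None if the note gives no figure."""
    return RAM_GUIDANCE_GB.get(params_billion)


def can_run(params_billion, available_ram_gb):
    """Check available RAM against the guidance; raise if the size is not covered."""
    needed = recommended_ram_gb(params_billion)
    if needed is None:
        raise ValueError(f"no RAM guidance for {params_billion}B models")
    return available_ram_gb >= needed
```

For example, `can_run(7, 16)` is true, while `can_run(13, 16)` is false.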

Examples

Pull a public model

ollama pull llama2

This command can also be used to update a local model. Only the parts that have changed are downloaded.

Run a model interactively

ollama run llama2
>>> hi
Hello! How can I help you today?

For multiline input, you can wrap text with """:

>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.

Run a model non-interactively

$ ollama run llama2 'tell me a joke'
 Sure! Here's a quick one:
 Why did the scarecrow win an award? Because he was outstanding in his field!

$ cat <<EOF >prompts.txt
tell me a joke about llamas
tell me another one
EOF
$ ollama run llama2 <prompts.txt
>>> tell me a joke about llamas
 Why did the llama refuse to play hide-and-seek?
 nobody likes to be hided!

>>> tell me another one
 Sure, here's another one:

Why did the llama go to the bar?
To have a hay-often good time!

Run a model on the contents of a text file

$ ollama run llama2 "summarize this file:" "$(cat README.md)"
 Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
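
The same invocation could be assembled from a script. The sketch below only builds the argument list for the `ollama run` command shown above; `build_run_command` is a hypothetical helper, not part of Ollama. It strips trailing newlines from the file contents, which is what the shell's `$(cat ...)` substitution does as well.

```python
from pathlib import Path


def build_run_command(model, instruction, file_path):
    """Build the argv for `ollama run <model> <instruction> <file contents>`."""
    # Command substitution in the shell drops trailing newlines; mirror that here.
    contents = Path(file_path).read_text(encoding="utf-8").rstrip("\n")
    return ["ollama", "run", model, instruction, contents]
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True, check=True)` on a machine where `ollama` is installed and the server is running.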

Customize a model

Pull a base model:

ollama pull llama2

Create a Modelfile:

FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

Next, create and run the model:

ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.

For more examples, see the examples directory. For more information on creating a Modelfile, see the Modelfile documentation.
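
If several variants of a model are needed, a Modelfile like the one above can be generated rather than written by hand. This is a minimal sketch that only covers the three instructions used in this section (`FROM`, `PARAMETER temperature`, and `SYSTEM`); `render_modelfile` is a hypothetical helper, and the Modelfile documentation remains the reference for the full syntax.

```python
def render_modelfile(base_model, system_prompt, temperature=None):
    """Render a Modelfile with FROM, an optional temperature, and a SYSTEM prompt."""
    if '"""' in system_prompt:
        # The prompt is wrapped in triple quotes, so it must not contain them.
        raise ValueError('system prompt must not contain """')
    lines = [f"FROM {base_model}"]
    if temperature is not None:
        lines.append(f"PARAMETER temperature {temperature}")
    lines.append('SYSTEM """')
    lines.append(system_prompt.strip())
    lines.append('"""')
    return "\n".join(lines) + "\n"
```

Writing the returned text to `./Modelfile` and then running `ollama create mario -f ./Modelfile` follows the same steps as above.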

List local models

ollama list

Remove local models

ollama rm llama2

Model packages

Overview

Ollama bundles model weights, configurations, and data into a single package, defined by a Modelfile.

Building

Install cmake and go, for example with Homebrew:

brew install cmake
brew install go

Then generate dependencies and build:

go generate ./...
go build .

Next, start the server:

./ollama serve

Finally, in a separate shell, run a model:

./ollama run llama2

REST API

Ollama has an API for running and managing models. See the API documentation for all endpoints.

For example, to generate text from a model:

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt":"Why is the sky blue?"
}'
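
The same request can be sent from Python using only the standard library. The sketch below rests on one assumption that is not stated in this README: that the endpoint streams its reply as newline-delimited JSON objects, each carrying a `response` text fragment and a final object with `"done": true`. Check the API documentation before relying on that format. The function names are hypothetical.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt, url=OLLAMA_URL):
    """Build the POST request with the same JSON body as the curl example."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def collect_text(lines):
    """Join 'response' fragments from JSON lines (assumed streaming format)."""
    parts = []
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        raw = raw.strip()
        if not raw:
            continue
        chunk = json.loads(raw)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


def generate(model, prompt):
    """Send the request to a running `ollama serve` and return the full text."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return collect_text(resp)
```

With the server running, `generate("llama2", "Why is the sky blue?")` would return the concatenated reply as one string.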

Community Projects using Ollama

Project	Description
LangChain and LangChain.js	Use Ollama from LangChain and LangChain.js. A question-answering example is also available.
Continue	Embeds Ollama inside Visual Studio Code. The extension lets you highlight code to add to the prompt, ask questions in the sidebar, and generate code inline.
LiteLLM	Lightweight Python package to simplify LLM API calls.
Discord AI Bot	Interact with Ollama as a chatbot on Discord.
Raycast Ollama	Raycast extension for running local Llama inference with Ollama.
Simple HTML UI	A simple HTML UI for Ollama. A Chrome extension is also available.
Emacs client