

Ollama

Discord

Run, create, and share large language models (LLMs).

Note: Ollama is in early preview. Please report any issues you find.

Download

Quickstart

To run and chat with Llama 2, the new model by Meta:

ollama run llama2

Model library

Ollama supports a list of open-source models, available at ollama.ai/library.

Here are some example open-source models that can be downloaded:

| Model | Parameters | Size | Download |
| --- | --- | --- | --- |
| Llama2 | 7B | 3.8GB | ollama pull llama2 |
| Llama2 13B | 13B | 7.3GB | ollama pull llama2:13b |
| Llama2 70B | 70B | 39GB | ollama pull llama2:70b |
| Llama2 Uncensored | 7B | 3.8GB | ollama pull llama2-uncensored |
| Code Llama | 7B | 3.8GB | ollama pull codellama |
| Orca Mini | 3B | 1.9GB | ollama pull orca-mini |
| Vicuna | 7B | 3.8GB | ollama pull vicuna |
| Nous-Hermes | 7B | 3.8GB | ollama pull nous-hermes |
| Nous-Hermes 13B | 13B | 7.3GB | ollama pull nous-hermes:13b |
| Wizard Vicuna Uncensored | 13B | 7.3GB | ollama pull wizard-vicuna |

Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.

Examples

Pull a public model

ollama pull llama2

This command can also be used to update a local model. Only the changes since the last pull are downloaded.

Run a model interactively

ollama run llama2
>>> hi
Hello! How can I help you today?

For multiline input, you can wrap text with """:

>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.

Run a model non-interactively

$ ollama run llama2 'tell me a joke'
 Sure! Here's a quick one:
 Why did the scarecrow win an award? Because he was outstanding in his field!
$ cat <<EOF >prompts.txt
tell me a joke about llamas
tell me another one
EOF
$ ollama run llama2 <prompts.txt
>>> tell me a joke about llamas
 Why did the llama refuse to play hide-and-seek?
 nobody likes to be hided!

>>> tell me another one
 Sure, here's another one:

Why did the llama go to the bar?
To have a hay-often good time!

Run a model on contents of a text file

$ ollama run llama2 "summarize this file:" "$(cat README.md)"
 Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
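
The same one-shot invocation can be scripted. The following is a minimal Python sketch, not part of Ollama itself: the helper names `build_run_command` and `summarize_file` are hypothetical, and it assumes the `ollama` binary is on your PATH and the server is running.

```python
import subprocess


def build_run_command(model, *prompt_parts, binary="ollama"):
    """Return the argv list for a one-shot `ollama run MODEL PROMPT...` call.

    Each prompt part is passed as its own argument, mirroring the shell
    example above, so no shell quoting is needed.
    """
    return [binary, "run", model, *prompt_parts]


def summarize_file(path, model="llama2"):
    """Ask the model to summarize a text file and return its output."""
    with open(path, encoding="utf-8") as f:
        contents = f.read()
    cmd = build_run_command(model, "summarize this file:", contents)
    # check=True raises CalledProcessError if ollama exits with an error code.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

Calling `summarize_file("README.md")` would then run the equivalent of the shell command above and return the model's reply as a string.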

Customize a model

Pull a base model:

ollama pull llama2

Create a Modelfile:

FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

Next, create and run the model:

ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.

For more examples, see the examples directory. For more information on creating a Modelfile, see the Modelfile documentation.
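
If you create several variants of a model, the Modelfile text can be generated rather than written by hand. This is a rough sketch that uses only the three instructions shown above (FROM, PARAMETER temperature, SYSTEM); the function name `render_modelfile` is hypothetical, and it assumes the system prompt does not itself contain a `"""` sequence.

```python
def render_modelfile(base, system_prompt, temperature=1):
    """Build Modelfile text with a base model, a temperature, and a system prompt."""
    lines = [
        f"FROM {base}",
        "",
        f"PARAMETER temperature {temperature}",
        "",
        'SYSTEM """',
        system_prompt.strip(),
        '"""',
        "",
    ]
    return "\n".join(lines)


def write_modelfile(path, base, system_prompt, temperature=1):
    """Write the rendered Modelfile to disk so `ollama create` can read it."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(render_modelfile(base, system_prompt, temperature))
```

After `write_modelfile("./Modelfile", "llama2", "You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.")`, the model can be created with `ollama create mario -f ./Modelfile` as shown above.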

Listing local models

ollama list

Removing local models

ollama rm llama2

Model packages

Overview

Ollama bundles model weights, configurations, and data into a single package, defined by a Modelfile.


Building

Install cmake:

brew install cmake

Then generate dependencies and build:

go generate ./...
go build .

Next, start the server:

./ollama serve

Finally, in a separate shell, run a model:

./ollama run llama2

REST API

See the API documentation for all endpoints.

Ollama has an API for running and managing models. For example, to generate text from a model:

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt":"Why is the sky blue?"
}'
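
The same request can be sent from Python using only the standard library. This sketch makes an assumption that this README does not state: that the endpoint streams its reply as newline-delimited JSON objects, each carrying a `response` text fragment and a `done` flag. Check the API documentation before relying on it. The function names are hypothetical.

```python
import json
import urllib.request


def build_request(model, prompt, host="http://localhost:11434"):
    """Build the POST request for /api/generate, matching the curl example."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        host + "/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def join_stream(lines):
    """Concatenate the `response` fragments from a stream of JSON lines.

    Assumes one JSON object per line with optional `response` and `done`
    keys; stops at the first object whose `done` is true.
    """
    parts = []
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line:
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


def generate(model, prompt, host="http://localhost:11434"):
    """Send the prompt to a running Ollama server and return the full reply."""
    with urllib.request.urlopen(build_request(model, prompt, host)) as resp:
        # An HTTP response object can be iterated line by line.
        return join_stream(resp)
```

With the server running, `generate("llama2", "Why is the sky blue?")` would return the complete answer as a single string.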

Community Projects using Ollama