
Ollama


Run, create, and share large language models (LLMs).

Note: Ollama is in early preview. Please report any issues you find.

Download

Quickstart

To run and chat with Llama 2, the new model by Meta:

ollama run llama2

Model library

Ollama supports a library of open-source models, available at ollama.ai/library

Here are some example open-source models that can be downloaded:

| Model                    | Parameters | Size  | Download                        |
| ------------------------ | ---------- | ----- | ------------------------------- |
| Llama2                   | 7B         | 3.8GB | ollama pull llama2              |
| Llama2 13B               | 13B        | 7.3GB | ollama pull llama2:13b          |
| Llama2 70B               | 70B        | 39GB  | ollama pull llama2:70b          |
| Llama2 Uncensored        | 7B         | 3.8GB | ollama pull llama2-uncensored   |
| Code Llama               | 7B         | 3.8GB | ollama pull codellama           |
| Orca Mini                | 3B         | 1.9GB | ollama pull orca-mini           |
| Vicuna                   | 7B         | 3.8GB | ollama pull vicuna              |
| Nous-Hermes              | 7B         | 3.8GB | ollama pull nous-hermes         |
| Nous-Hermes 13B          | 13B        | 7.3GB | ollama pull nous-hermes:13b     |
| Wizard Vicuna Uncensored | 13B        | 7.3GB | ollama pull wizard-vicuna       |

Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.

Examples

Pull a public model

ollama pull llama2

This command can also be used to update a local model; only the changes will be pulled.

Run a model interactively

ollama run llama2
>>> hi
Hello! How can I help you today?

For multiline input, you can wrap text with """:

>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.

Run a model non-interactively

$ ollama run llama2 'tell me a joke'
 Sure! Here's a quick one:
 Why did the scarecrow win an award? Because he was outstanding in his field!
$ cat <<EOF >prompts.txt
tell me a joke about llamas
tell me another one
EOF
$ ollama run llama2 <prompts.txt
>>> tell me a joke about llamas
 Why did the llama refuse to play hide-and-seek?
 nobody likes to be hided!

>>> tell me another one
 Sure, here's another one:

Why did the llama go to the bar?
To have a hay-often good time!

Run a model on contents of a text file

$ ollama run llama2 "summarize this file:" "$(cat README.md)"
 Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.

Customize a model

Pull a base model:

ollama pull llama2

Create a Modelfile:

FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

Next, create and run the model:

ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.

For more examples, see the examples directory. For more information on creating a Modelfile, see the Modelfile documentation.

Listing local models

ollama list

Removing local models

ollama rm llama2

Model packages

Overview

Ollama bundles model weights, configurations, and data into a single package, defined by a Modelfile.
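As a sketch, a Modelfile that bundles these pieces might look like the following. It uses only the FROM, PARAMETER, and SYSTEM instructions shown earlier in this README; the specific values are illustrative:

```
# Base model weights to inherit from
FROM llama2

# Generation parameters baked into the package
PARAMETER temperature 0.7

# A system prompt shipped alongside the weights
SYSTEM """
You are a concise, helpful assistant.
"""
```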


Building

Install cmake (the Go toolchain is also required to build):

brew install cmake

Then generate dependencies and build:

go generate ./...
go build .

Next, start the server:

./ollama serve

Finally, in a separate shell, run a model:

./ollama run llama2

REST API

See the API documentation for all endpoints.

Ollama has an API for running and managing models. For example, to generate text from a model:

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'
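The same request can be made from a program. Below is a minimal Python sketch, assuming the server streams newline-delimited JSON objects that carry partial output in a "response" field and end with a "done" marker:

```python
# Minimal client for the generate endpoint shown in the curl example above.
# Assumes a streaming, newline-delimited JSON response format.
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> bytes:
    """Encode the request body used in the curl example."""
    return json.dumps({"model": model, "prompt": prompt}).encode("utf-8")

def generate(prompt: str, model: str = "llama2") -> str:
    """Send a prompt and collect the streamed fragments into one string."""
    req = request.Request(OLLAMA_URL, data=build_payload(model, prompt))
    parts = []
    with request.urlopen(req) as resp:
        for line in resp:
            chunk = json.loads(line)
            if chunk.get("done"):
                break
            parts.append(chunk.get("response", ""))
    return "".join(parts)

# Usage (requires `ollama serve` running locally):
# print(generate("Why is the sky blue?"))
```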

Community Projects using Ollama