# Ollama

> Note: Ollama is in early preview. Please report any issues you find.

Run, create, and share large language models (LLMs).

## Download

## Quickstart

To run and chat with Llama 2, the new model by Meta:

```
ollama run llama2
```

## Model library

`ollama` includes a library of open-source models:

| Model                    | Parameters | Size  | Download                        |
| ------------------------ | ---------- | ----- | ------------------------------- |
| Llama2                   | 7B         | 3.8GB | `ollama pull llama2`            |
| Llama2 Uncensored        | 7B         | 3.8GB | `ollama pull llama2-uncensored` |
| Llama2 13B               | 13B        | 7.3GB | `ollama pull llama2:13b`        |
| Orca Mini                | 3B         | 1.9GB | `ollama pull orca`              |
| Vicuna                   | 7B         | 3.8GB | `ollama pull vicuna`            |
| Nous-Hermes              | 13B        | 7.3GB | `ollama pull nous-hermes`       |
| Wizard Vicuna Uncensored | 13B        | 7.3GB | `ollama pull wizard-vicuna`     |

> Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.
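
Any model in the table can be run directly by its name and tag; `ollama run` downloads the model first if it is not already present locally. For example, to chat with the 13B variant of Llama 2:

```
ollama run llama2:13b
```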

## Examples

### Run a model

```
ollama run llama2
>>> hi
Hello! How can I help you today?
```

### Create a custom model

Pull a base model:

```
ollama pull llama2
```

To update a model to the latest version, run `ollama pull llama2` again. The model will be updated (if necessary).

Create a `Modelfile`:

```
FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
```

Next, create and run the model:

```
ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
```

For more examples, see the examples directory.

For more information on creating a Modelfile, see the Modelfile documentation.
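
A `Modelfile` can also tune runtime behavior. The sketch below assumes the `num_ctx` (context window size) and `stop` (stop sequence) parameters; treat the names as illustrative and check the Modelfile documentation for the supported set:

```
FROM llama2

# illustrative runtime options (verify against the Modelfile documentation)
PARAMETER temperature 0.8
PARAMETER num_ctx 4096
PARAMETER stop "User:"

# system prompt, as in the Mario example above
SYSTEM """
You are a concise assistant.
"""
```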

### Pull a model from the registry

```
ollama pull orca
```

### Listing local models

```
ollama list
```
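
Illustrative output (the exact column layout may vary by version; the model names come from the examples above):

```
NAME            SIZE     MODIFIED
llama2:latest   3.8 GB   2 days ago
mario:latest    3.8 GB   5 minutes ago
```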

## Model packages

### Overview

Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.

## Building

```
go build .
```

To run it, start the server:

```
./ollama serve &
```
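
To verify the server is running, query it on its default port, 11434 (the same port the REST API examples below use):

```
curl http://localhost:11434
```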

Finally, run a model!

```
./ollama run llama2
```

## REST API

### POST /api/generate

Generate text from a model.

```
curl -X POST http://localhost:11434/api/generate -d '{"model": "llama2", "prompt":"Why is the sky blue?"}'
```
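
The response streams back as a series of JSON objects. As a sketch, per-request runtime options can also be supplied via an `options` map; the parameter names are assumed here to mirror Modelfile `PARAMETER` names, so check the API source for the exact schema:

```
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "options": {"temperature": 0.7}
}'
```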

### POST /api/create

Create a model from a Modelfile.

```
curl -X POST http://localhost:11434/api/create -d '{"name": "my-model", "path": "/path/to/modelfile"}'
```
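
Putting the two endpoints together, a minimal end-to-end sketch (the model name `mario-api` and the temporary path are invented for illustration):

```
# write a Modelfile to a temporary path
cat > /tmp/Modelfile <<'EOF'
FROM llama2
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
EOF

# create the model from the Modelfile, then generate from it
curl -X POST http://localhost:11434/api/create -d '{"name": "mario-api", "path": "/tmp/Modelfile"}'
curl -X POST http://localhost:11434/api/generate -d '{"model": "mario-api", "prompt": "hi"}'
```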

## Tools using Ollama

- LangChain integration - set up all-local, JS-based retrieval + QA over docs in 5 minutes.
- Continue - embeds Ollama inside Visual Studio Code. The extension lets you highlight code to add to the prompt, ask questions in the sidebar, and generate code inline.
- Discord AI Bot - interact with Ollama as a chatbot on Discord.
- Raycast Ollama - Raycast extension to use Ollama for local llama inference on Raycast.
- Simple HTML UI for Ollama