Ollama

Run AI models locally.

Note: this project is a work in progress. The features below are still in development.

Features

  • Run models locally on macOS (Windows, Linux and other platforms coming soon)
  • Ollama uses the fastest loader available for your platform and model (currently llama.cpp; Core ML and other loaders coming soon)
  • Import models from local files
  • Find and download models on Hugging Face and other sources (coming soon)
  • Support for running and switching between multiple models at a time (coming soon)
  • Native desktop experience (coming soon)
  • Built-in memory (coming soon)

Install

pip install ollama

Install From Source

git clone git@github.com:jmorganca/ollama ollama
cd ollama
pip install -r requirements.txt
pip install -e .

Quickstart

% ollama run huggingface.co/TheBloke/orca_mini_3B-GGML
Pulling huggingface.co/TheBloke/orca_mini_3B-GGML...
Downloading [================           ] 66.67% 11.8MiB/s

...
...
...

> Hello

Hello, how may I help you?

Python SDK

Example

import ollama
ollama.generate("orca-mini-3b", "hi")

ollama.generate(model, message)

Generate a completion

ollama.generate("./llama-7b-ggml.bin", "hi")

ollama.models()

List available local models

models = ollama.models()
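
A short sketch that lists local models and runs a prompt against the first one, assuming models() returns a list of model names:

import ollama

# List the models available locally.
# Assumption: models() returns a list of model name strings.
models = ollama.models()
print("Local models:", models)

if models:
    print(ollama.generate(models[0], "hi"))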

ollama.serve()

Serve the Ollama HTTP server

ollama.serve()
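
A sketch of starting the server from a script; it assumes serve() blocks the current thread while serving requests, so it is called last:

import ollama

# Start the Ollama HTTP server.
# Assumption: serve() blocks until the server is stopped.
if __name__ == "__main__":
    ollama.serve()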

ollama.add(filepath)

Add a model by importing from a file

ollama.add("./path/to/model")

ollama.load(model)

Manually load a model for generation

ollama.load("model")

ollama.unload(model)

Unload a model

ollama.unload("model")

ollama.pull(model)

Download a model

ollama.pull("huggingface.co/thebloke/llama-7b-ggml")

Coming Soon

ollama.search("query")

Search for compatible models that Ollama can run

ollama.search("llama-7b")

Documentation