Ollama
Run AI models locally.
Note: this project is a work in progress. The features below are still in development.
Features
- Run models locally on macOS (Windows, Linux and other platforms coming soon)
- Ollama uses the fastest loader available for your platform and model (e.g. llama.cpp, Core ML and other loaders coming soon)
- Import models from local files
- Find and download models on Hugging Face and other sources (coming soon)
- Support for running and switching between multiple models at a time (coming soon)
- Native desktop experience (coming soon)
- Built-in memory (coming soon)
Install
pip install ollama
Install From Source
git clone git@github.com:jmorganca/ollama ollama
cd ollama
pip install -r requirements.txt
pip install -e .
Quickstart
% ollama run huggingface.co/TheBloke/orca_mini_3B-GGML
Pulling huggingface.co/TheBloke/orca_mini_3B-GGML...
Downloading [================ ] 66.67% 11.8MiB/s
...
...
...
> Hello
Hello, how may I help you?
Python SDK
Example
import ollama
ollama.generate("orca-mini-3b", "hi")
ollama.generate(model, message)
Generate a completion
ollama.generate("./llama-7b-ggml.bin", "hi")
ollama.models()
List available local models
models = ollama.models()
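For example, to print every locally available model (assuming models() returns a list of model identifiers, which is an assumption):

import ollama

# Assumption: models() returns a list of local model identifiers.
for model in ollama.models():
    print(model)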
ollama.serve()
Serve the Ollama HTTP server
ollama.serve()
ollama.add(filepath)
Add a model by importing from a file
ollama.add("./path/to/model")
ollama.load(model)
Manually load a model for generation
ollama.load("model")
ollama.unload(model)
Unload a model
ollama.unload("model")
ollama.pull(model)
Download a model
ollama.pull("huggingface.co/thebloke/llama-7b-ggml")
Coming Soon
ollama.search("query")
Search for compatible models that Ollama can run
ollama.search("llama-7b")