# Ollama
Ollama is a tool for running large language models on any machine. It's designed to be easy to use and fast, supporting as many models as possible by using the fastest loader available for your platform and model.
Note: this project is a work in progress.
## Install

```
pip install ollama
```
## Quickstart

To run a model, use `ollama run`:

```
ollama run orca-mini-3b
```

You can also run models from Hugging Face:

```
ollama run huggingface.co/TheBloke/orca_mini_3B-GGML
```

Or run a downloaded model file directly:

```
ollama run ~/Downloads/orca-mini-13b.ggmlv3.q4_0.bin
```
## Python SDK

### Example

```python
import ollama
ollama.generate("orca-mini-3b", "hi")
```
### `ollama.generate(model, message)`

Generate a completion

```python
ollama.generate("./llama-7b-ggml.bin", "hi")
### `ollama.models()`

List available local models

```python
models = ollama.models()
```
### `ollama.load(model)`

Manually load a model for generation

```python
ollama.load("model")
```
### `ollama.unload(model)`

Unload a model

```python
ollama.unload("model")
```
### `ollama.pull(model)`

Download a model

```python
ollama.pull("huggingface.co/thebloke/llama-7b-ggml")
```
## Coming Soon

### `ollama.search("query")`

Search for compatible models that Ollama can run

```python
ollama.search("llama-7b")
```