768 B
768 B
Ollama
An easy, fast runtime for large language models, powered by llama.cpp
.
Note: this project is a work in progress. Certain models that can be run with
ollama
are intended for research and/or non-commercial use only.
Install
Using pip
:
pip install ollama
Using docker
:
docker run ollama/ollama
Quickstart
To run a model, use ollama run
:
ollama run orca-mini-3b
You can also run models from hugging face:
ollama run huggingface.co/TheBloke/orca_mini_3B-GGML
Or directly via downloaded model files:
ollama run ~/Downloads/orca-mini-13b.ggmlv3.q4_0.bin
Building
go generate ./...
go build .