# Ollama

Run AI models locally.

> _Note: this project is a work in progress. The features below are still in development._

**Features**

- Run models locally on macOS (Windows, Linux and other platforms coming soon)
- Ollama uses the fastest loader available for your platform and model (e.g. llama.cpp, Core ML and other loaders coming soon)
- Import models from local files
- Find and download models on Hugging Face and other sources (coming soon)
- Support for running and switching between multiple models at a time (coming soon)
- Native desktop experience (coming soon)
- Built-in memory (coming soon)

## Install

```
pip install ollama
```

## Quickstart

```
% ollama run huggingface.co/TheBloke/orca_mini_3B-GGML
Pulling huggingface.co/TheBloke/orca_mini_3B-GGML...
Downloading [================>          ] 66.67% (2/3) 30.2MB/s
...
...
...
> Hello

Hello, how may I help you?
```

## Python SDK

### Example

```python
import ollama
ollama.generate("orca-mini-3b", "hi")
```

A fuller sketch combining the calls below is included at the end of this README.

### `ollama.generate(model, message)`

Generate a completion

```python
ollama.generate("./llama-7b-ggml.bin", "hi")
```

### `ollama.models()`

List available local models

```python
models = ollama.models()
```

### `ollama.serve()`

Serve the Ollama HTTP server

```python
ollama.serve()
```

### `ollama.add(filepath)`

Add a model by importing it from a file

```python
ollama.add("./path/to/model")
```

### `ollama.load(model)`

Manually load a model for generation

```python
ollama.load("model")
```

### `ollama.unload(model)`

Unload a model

```python
ollama.unload("model")
```

## Coming Soon

### `ollama.pull(model)`

Download a model

```python
ollama.pull("huggingface.co/thebloke/llama-7b-ggml")
```

### `ollama.search("query")`

Search for compatible models that Ollama can run

```python
ollama.search("llama-7b")
```

## Documentation

- [Development](docs/development.md)
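For reference, here is a minimal end-to-end sketch that strings together the SDK calls documented above. The return values of `ollama.models()` and `ollama.generate()` are assumptions (a list of model names and a generated string, respectively), so treat this as illustrative rather than a definitive usage pattern.

```python
import ollama

# List locally available models (assumed to return a list of model names).
models = ollama.models()
print("local models:", models)

# Explicitly load a model before generating (assumption: generate() may also
# load on demand, so this step is likely optional).
ollama.load("orca-mini-3b")

# Generate a completion. The return format is an assumption; it may be a
# string or a stream of tokens depending on the SDK version.
response = ollama.generate("orca-mini-3b", "Why is the sky blue?")
print(response)

# Free the model's memory when finished.
ollama.unload("orca-mini-3b")
```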