# Development

- Install cmake and, optionally, the required tools for GPU support
- Run `go generate ./...`
- Run `go build .`

Install required tools:

- cmake version 3.24 or higher
- go version 1.20 or higher
- gcc version 11.4.0 or higher

```bash
brew install go cmake gcc
```
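To confirm that installed tools meet the minimum versions listed above, a small helper along these lines may be useful. This is only a sketch: `version_ge` and `check_tool` are hypothetical names, not part of the ollama repository, and the comparison assumes purely numeric dotted versions.

```bash
#!/usr/bin/env bash

# version_ge HAVE WANT: succeed when dotted version HAVE >= WANT.
# Assumes every component is a plain number (e.g. 3.24.1), not "1.20rc1".
version_ge() {
  local IFS=.
  local -a have=($1) want=($2)
  local i h w
  for ((i = 0; i < ${#want[@]} || i < ${#have[@]}; i++)); do
    h=${have[i]:-0}
    w=${want[i]:-0}
    if ((10#$h > 10#$w)); then return 0; fi
    if ((10#$h < 10#$w)); then return 1; fi
  done
  return 0
}

# check_tool NAME MIN FOUND: report whether FOUND satisfies MIN.
check_tool() {
  if version_ge "$3" "$2"; then
    echo "$1 $3 ok (>= $2)"
  else
    echo "$1 $3 too old (need >= $2)"
    return 1
  fi
}
```

For example, `check_tool cmake 3.24 "$(cmake --version | awk 'NR==1 {print $3}')"` should work if `cmake --version` prints its usual `cmake version X.Y.Z` first line. Extracting the version from `go version` or `gcc` needs its own parsing.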

Optionally enable debugging and more verbose logging:

```bash
export CGO_CFLAGS="-g"
```

Get the required libraries and build the native LLM code:

```bash
go generate ./...
```

Then build ollama:

```bash
go build .
```

Now you can run `ollama`:

```bash
./ollama
```
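The generate and build steps above can be wrapped in one function for repeated local builds. The following is a minimal sketch, not a script shipped with ollama: `run`, `build_ollama`, and the `DRY_RUN` and `DEBUG` variables are names invented for this example.

```bash
#!/usr/bin/env bash

# run CMD...: print the command, then execute it unless DRY_RUN is set.
run() {
  echo "+ $*"
  if [ -z "${DRY_RUN:-}" ]; then
    "$@"
  fi
}

# build_ollama: build the native LLM code, then the Go binary.
# Setting DEBUG (hypothetical) turns on debug symbols and verbose logging
# through CGO_CFLAGS, as described above.
build_ollama() {
  if [ -n "${DEBUG:-}" ]; then
    export CGO_CFLAGS="-g"
  fi
  run go generate ./... || return 1
  run go build . || return 1
}
```

Calling `DRY_RUN=1 build_ollama` prints the two commands without running them. Calling `DEBUG=1 build_ollama` from the repository root performs a debug build.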

## Building on Linux with GPU support


### Linux/Windows CUDA (NVIDIA)
*Your operating system distribution may already have packages for NVIDIA CUDA. Distro packages are often preferable, but instructions are distro-specific. Please consult distro-specific docs for dependencies if available!*

Install `cmake` and `golang` as well as [NVIDIA CUDA](https://developer.nvidia.com/cuda-downloads) development and runtime packages.
Then generate dependencies:

```bash
go generate ./...
```

Then build the binary:

```bash
go build .
```

### Linux ROCm (AMD)
*Your operating system distribution may already have packages for AMD ROCm and CLBlast. Distro packages are often preferable, but instructions are distro-specific. Please consult distro-specific docs for dependencies if available!*

Install the [CLBlast](https://github.com/CNugteren/CLBlast/blob/master/doc/installation.md) and [ROCm](https://rocm.docs.amd.com/en/latest/deploy/linux/quick_start.html) development packages first, as well as `cmake` and `golang`.
The paths below are correct for Arch. Adjust them to match your distribution's install locations, then generate dependencies:

```bash
CLBlast_DIR=/usr/lib/cmake/CLBlast ROCM_PATH=/opt/rocm go generate ./...
```

Then build the binary:

```bash
go build .
```
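Because the CLBlast and ROCm locations differ between distributions, it can help to probe a few candidate directories before running `go generate`. This is a hedged sketch: `first_existing_dir` is a hypothetical helper, and any candidate paths other than the Arch ones shown above are placeholders you would supply yourself.

```bash
#!/usr/bin/env bash

# first_existing_dir DIR...: print the first argument that is an existing
# directory; fail if none of them exist.
first_existing_dir() {
  local d
  for d in "$@"; do
    if [ -d "$d" ]; then
      printf '%s\n' "$d"
      return 0
    fi
  done
  return 1
}
```

A possible use is `ROCM_PATH="$(first_existing_dir /opt/rocm "$HOME/rocm")" || echo "ROCm not found"`. Here `$HOME/rocm` stands in for whatever location your distribution uses.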

ROCm requires elevated privileges to access the GPU at runtime. On most distros you can add your user account to the `render` group, or run as root.
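To check whether your account is already in the `render` group, something like the following may help. `in_group` is a hypothetical helper written for this example. It reads the current user's groups from `id -nG` unless a list is passed explicitly.

```bash
#!/usr/bin/env bash

# in_group GROUP [LIST]: succeed when GROUP appears in the space-separated
# LIST. LIST defaults to the current user's group names from `id -nG`.
in_group() {
  local list=${2-$(id -nG)}
  case " $list " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

If `in_group render` fails, `sudo usermod -aG render "$USER"` adds the account on most distributions that use `usermod`. The new membership usually takes effect only after logging out and back in.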

## Containerized Build

If you have Docker available, you can build Linux binaries with `./scripts/build_linux.sh`, which includes the CUDA and ROCm dependencies.