Add Windows arm64 support to official builds (#5712)

* Unified arm/x86 windows installer

This adjusts the installer payloads to be architecture aware so we can carry
both amd64 and arm64 binaries in the installer, and install only the applicable
architecture at install time.

* Include arm64 in official windows build

* Harden schedule test for slow windows timers

This test seems to be a bit flaky on Windows, so give it more time to converge.

2024-09-20 13:09:38 -07:00


Development

Install required tools:

cmake version 3.24 or higher
go version 1.22 or higher
gcc version 11.4.0 or higher
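
As a quick sanity check before building, the minimum versions listed above can be compared against what is installed. This is a minimal sketch, not part of the Ollama repository: `version_ge` and `check_tool` are hypothetical helpers, the version strings are scraped from each tool's output with a simple pattern match, and `sort -V` is assumed to be available.

```shell
#!/usr/bin/env bash
# Sketch: compare installed tool versions against the minimums listed above.
# version_ge and check_tool are hypothetical helpers for illustration only.

# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) being available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Prints whether one tool meets its minimum version.
# $1 = tool name, $2 = detected version (may be empty), $3 = minimum version.
check_tool() {
  local name="$1" found="$2" minimum="$3"
  if [ -z "$found" ]; then
    echo "$name: not found (need $minimum or higher)"
  elif version_ge "$found" "$minimum"; then
    echo "$name: $found (ok)"
  else
    echo "$name: $found (need $minimum or higher)"
  fi
}

# Extract the first dotted version number from each tool's version output.
first_version() {
  grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

check_tool cmake "$(cmake --version 2>/dev/null | first_version)" 3.24
check_tool go "$(go version 2>/dev/null | first_version)" 1.22
check_tool gcc "$(gcc --version 2>/dev/null | first_version)" 11.4.0
```

The pattern match is a heuristic; if a tool prints its version in an unusual format, check it by hand.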

macOS

brew install go cmake gcc

Optionally enable debugging and more verbose logging:

# At build time
export CGO_CFLAGS="-g"

# At runtime
export OLLAMA_DEBUG=1

Get the required libraries and build the native LLM code:

go generate ./...

Then build ollama:

go build .

Now you can run ollama:

./ollama

Linux

Linux CUDA (NVIDIA)

Your operating system distribution may already have packages for NVIDIA CUDA. Distro packages are often preferable, but instructions are distro-specific. Please consult distro-specific docs for dependencies if available!

Install cmake and golang as well as NVIDIA CUDA development and runtime packages.

The build scripts typically auto-detect CUDA. If your Linux distro or installation approach uses unusual paths, set the environment variable CUDA_LIB_DIR to the location of the shared libraries and CUDACXX to the location of the nvcc compiler. You can customize the set of target CUDA architectures by setting CMAKE_CUDA_ARCHITECTURES (e.g. "50;60;70").
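
As an illustration of these overrides, the variables can be exported before generating. The two paths below are assumptions (a common layout for a CUDA install under /usr/local/cuda), not values taken from the Ollama build scripts, so substitute the locations used on your system.

```shell
#!/usr/bin/env bash
# Sketch: point the build at a CUDA install in a non-default location.
# Both paths are assumed for illustration; replace them with your own.
export CUDA_LIB_DIR=/usr/local/cuda/lib64    # directory holding the CUDA shared libraries (assumed)
export CUDACXX=/usr/local/cuda/bin/nvcc      # full path to the nvcc compiler (assumed)
export CMAKE_CUDA_ARCHITECTURES="50;60;70"   # target architectures, as in the example above

# Warn early if the assumed paths do not exist on this machine.
[ -d "$CUDA_LIB_DIR" ] || echo "warning: CUDA_LIB_DIR=$CUDA_LIB_DIR is not a directory" >&2
[ -x "$CUDACXX" ] || echo "warning: CUDACXX=$CUDACXX is not executable" >&2

# Then, from the repository root, run: go generate ./... && go build .
```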

Then generate dependencies:

go generate ./...

Then build the binary:

go build .

Linux ROCm (AMD)

Your operating system distribution may already have packages for AMD ROCm and CLBlast. Distro packages are often preferable, but instructions are distro-specific. Please consult distro-specific docs for dependencies if available!

Install CLBlast and ROCm development packages first, as well as cmake and golang.

The build scripts typically auto-detect ROCm. If your Linux distro or installation approach uses unusual paths, set the environment variable ROCM_PATH to the location of the ROCm install (typically /opt/rocm) and CLBlast_DIR to the location of the CLBlast install (typically /usr/lib/cmake/CLBlast). You can also customize the AMD GPU targets by setting AMDGPU_TARGETS (e.g. AMDGPU_TARGETS="gfx1101;gfx1102").
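
A minimal sketch of these overrides, using the typical locations and the example GPU targets mentioned above. Adjust all three values to match your install and hardware.

```shell
#!/usr/bin/env bash
# Sketch: point the build at ROCm and CLBlast explicitly.
# Values are the typical locations and example targets named in the text above.
export ROCM_PATH=/opt/rocm                   # root of the ROCm install
export CLBlast_DIR=/usr/lib/cmake/CLBlast    # CLBlast CMake package directory
export AMDGPU_TARGETS="gfx1101;gfx1102"      # AMD GPU targets to build for

# Warn early if the directories do not exist on this machine.
[ -d "$ROCM_PATH" ] || echo "warning: ROCM_PATH=$ROCM_PATH is not a directory" >&2
[ -d "$CLBlast_DIR" ] || echo "warning: CLBlast_DIR=$CLBlast_DIR is not a directory" >&2

# Then, from the repository root, run: go generate ./... && go build .
```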

Then generate dependencies:

go generate ./...

Then build the binary:

go build .

ROCm requires elevated privileges to access the GPU at runtime. On most distros you can add your user account to the render group, or run as root.
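
The group membership can be checked from a shell before running. This is a sketch under the assumption that your distro uses a group named render for GPU access; `in_group` is a hypothetical helper, and the `usermod` command is only printed, not executed.

```shell
#!/usr/bin/env bash
# Sketch: check whether the current user is in the render group.
# in_group is a hypothetical helper for illustration only.

# Succeeds when the current user belongs to the group named in $1.
in_group() {
  id -nG | tr ' ' '\n' | grep -qx -- "$1"
}

if in_group render; then
  echo "current user is already in the render group"
else
  echo "not in the render group; to add it, run: sudo usermod -aG render \"\$USER\""
  echo "then log out and back in so the new membership takes effect"
fi
```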

Advanced CPU Settings

By default, running go generate ./... compiles several variations of the LLM library based on common CPU families and vector math capabilities, including a lowest-common-denominator variant that should run on almost any 64-bit CPU, though somewhat slowly. At runtime, Ollama auto-detects the optimal variation to load. To build a CPU-based build customized for your processor, set OLLAMA_CUSTOM_CPU_DEFS to the llama.cpp flags you would like to use. For example, to compile an optimized binary for an Intel i9-9880H, you might use:

OLLAMA_CUSTOM_CPU_DEFS="-DGGML_AVX=on -DGGML_AVX2=on -DGGML_F16C=on -DGGML_FMA=on" go generate ./...
go build .
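
If you are unsure which of these flags your processor supports, one option on x86 Linux is to derive them from /proc/cpuinfo. The sketch below is not part of the Ollama build scripts: `cpu_defs_from_flags` is a hypothetical helper, and it only covers the four defines used in the example above.

```shell
#!/usr/bin/env bash
# Sketch: derive OLLAMA_CUSTOM_CPU_DEFS from the CPU flags Linux reports.
# cpu_defs_from_flags is a hypothetical helper for illustration only.

# Maps a space-separated list of CPU flags to the matching llama.cpp defines.
# Only the four features from the example above are considered.
cpu_defs_from_flags() {
  local flags=" $1 " defs="" pair
  for pair in avx:GGML_AVX avx2:GGML_AVX2 f16c:GGML_F16C fma:GGML_FMA; do
    case "$flags" in
      # Match whole words only, so "avx" does not match "avx2" or "avx512f".
      *" ${pair%%:*} "*) defs="$defs -D${pair#*:}=on" ;;
    esac
  done
  echo "${defs# }"
}

# On x86 Linux, /proc/cpuinfo has a "flags" line per core; use the first one.
if [ -r /proc/cpuinfo ]; then
  host_flags="$(grep -m 1 '^flags' /proc/cpuinfo | cut -d: -f2 | tr -s '[:space:]' ' ')"
  echo "OLLAMA_CUSTOM_CPU_DEFS=\"$(cpu_defs_from_flags "$host_flags")\""
fi
```

The printed value can then be placed in front of go generate ./... as in the example above. An empty value means none of the four features were reported, in which case the default build is the safer choice.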

Containerized Linux Build

If you have Docker available, you can build Linux binaries with ./scripts/build_linux.sh, which has the CUDA and ROCm dependencies included. The resulting binary is placed in ./dist.

Windows

Note: The Windows build for Ollama is still under development.

First, install required tools:

MSVC toolchain - C/C++ and cmake as minimal requirements
Go version 1.22 or higher
MinGW with GCC (pick one variant):
- MinGW-w64
- MSYS2
The ThreadJob PowerShell module: Install-Module -Name ThreadJob -Scope CurrentUser

Then, build the ollama binary:

$env:CGO_ENABLED="1"
go generate ./...
go build .

Windows CUDA (NVIDIA)

In addition to the common Windows development tools described above, install CUDA after installing MSVC.

NVIDIA CUDA

Windows ROCm (AMD Radeon)

In addition to the common Windows development tools described above, install AMD's HIP package after installing MSVC.

Lastly, add ninja.exe included with MSVC to the system path (e.g. C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\Ninja).

Windows arm64

The Developer PowerShell for VS 2022 may default to x86, which is not what you want. To ensure you get an arm64 development environment, start a plain PowerShell terminal and run:

Import-Module 'C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\Microsoft.VisualStudio.DevShell.dll'
Enter-VsDevShell -Arch arm64 -VsInstallPath 'C:\Program Files\Microsoft Visual Studio\2022\Community' -SkipAutomaticLocation

You can confirm the target architecture with write-host $env:VSCMD_ARG_TGT_ARCH

Follow the instructions at https://www.msys2.org/wiki/arm64/ to set up an arm64 msys2 environment. Ollama requires gcc and mingw32-make to compile. gcc is not currently available on Windows arm64, but a gcc compatibility adapter is available via mingw-w64-clang-aarch64-gcc-compat. At a minimum you will need to install the following:

pacman -S mingw-w64-clang-aarch64-clang mingw-w64-clang-aarch64-gcc-compat mingw-w64-clang-aarch64-make make

To build ollama from source, ensure your PATH includes go, cmake, gcc, clang, and mingw32-make (the MSYS2 tools are typically in C:\msys64\clangarm64\bin\).
