Running Ollama on NVIDIA Jetson Devices
With some minor configuration, Ollama runs well on NVIDIA Jetson devices. The following steps have been tested on JetPack 5.1.2.
NVIDIA Jetson devices are Linux-based embedded AI computers that are purpose-built for AI applications.
Jetsons have an integrated GPU that is wired directly to the machine's memory controller. For this reason, the `nvidia-smi` command is unrecognized, and Ollama proceeds to operate in "CPU only" mode. You can verify this with a monitoring tool like jtop.
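jtop is not part of a stock JetPack image; it is typically installed via the jetson-stats Python package. A minimal install sketch (assuming pip3 is available on the device):

```bash
# Install jetson-stats, which provides the jtop monitoring tool
sudo pip3 install -U jetson-stats
# A reboot or re-login may be required before jtop can reach its service
jtop
```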
To address this, we simply pass the path to the Jetson's pre-installed CUDA libraries into `ollama serve` (run inside a tmux session so it stays up in the background). We then hardcode the `num_gpu` parameter into a cloned version of our target model.
Prerequisites:
- curl
- tmux
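If either tool is missing, both can be installed from the Ubuntu repositories that JetPack is built on:

```bash
sudo apt-get update && sudo apt-get install -y curl tmux
```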
Here are the steps:
- Install Ollama via the standard Linux install command (ignore the 404 error):

```bash
curl https://ollama.com/install.sh | sh
```
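To confirm the install, ask the CLI for its version. If the server is not running yet it may warn that it cannot connect, but it should still print the client version:

```bash
ollama --version
```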
- Stop the Ollama service:
```bash
sudo systemctl stop ollama
```
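Before starting your own instance, you can confirm the service is actually stopped:

```bash
# Prints "inactive" (and exits nonzero) once the service is stopped
systemctl is-active ollama
```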
- Start Ollama serve in a tmux session called ollama_jetson and reference the CUDA libraries path:
```bash
tmux has-session -t ollama_jetson 2>/dev/null || tmux new-session -d -s ollama_jetson 'LD_LIBRARY_PATH=/usr/local/cuda/lib64 ollama serve'
```
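By default the server listens on localhost port 11434, so you can check that it came up, and attach to the session to inspect its logs:

```bash
# The root endpoint responds with "Ollama is running" once the server is up
curl http://localhost:11434
# Attach to the session to watch the server logs (detach with Ctrl-b d)
tmux attach -t ollama_jetson
```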
- Pull the model you want to use (e.g. mistral):
```bash
ollama pull mistral
```
- Create a new Modelfile specifically for enabling GPU support on the Jetson:
```bash
touch ModelfileMistralJetson
```
- In the ModelfileMistralJetson file, specify the FROM model and the num_gpu PARAMETER as shown below:

```
FROM mistral
PARAMETER num_gpu 999
```
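Equivalently, the touch-and-edit steps above can be collapsed into a single heredoc (a shell convenience, not a requirement):

```bash
cat > ModelfileMistralJetson <<'EOF'
FROM mistral
PARAMETER num_gpu 999
EOF
```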
- Create a new model from your Modelfile:
```bash
ollama create mistral-jetson -f ./ModelfileMistralJetson
```
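To verify the clone was created, list the local models; both the original and the Jetson-specific clone should appear:

```bash
ollama list
```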
- Run the new model:
```bash
ollama run mistral-jetson
```
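The new model can also be exercised through Ollama's REST API on the same port, which is handy for a scripted GPU smoke test (the prompt here is arbitrary):

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "mistral-jetson",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```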
If you run a monitoring tool like jtop, you should now see that Ollama is using the Jetson's integrated GPU.

And that's it!