update jetson tutorial
parent 2bdc320216
commit 85bdf14b56
1 changed file with 7 additions and 30 deletions
@@ -1,38 +1,15 @@
 # Running Ollama on NVIDIA Jetson Devices
 
-With some minor configuration, Ollama runs well on [NVIDIA Jetson Devices](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/). The following has been tested on [JetPack 5.1.2](https://developer.nvidia.com/embedded/jetpack).
+Ollama runs well on [NVIDIA Jetson Devices](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/) and should run out of the box with the standard installation instructions.
 
-NVIDIA Jetson devices are Linux-based embedded AI computers that are purpose-built for AI applications.
+The following has been tested on [JetPack 5.1.2](https://developer.nvidia.com/embedded/jetpack), but should also work on JetPack 6.0.
 
-Jetsons have an integrated GPU that is wired directly to the memory controller of the machine. For this reason, the `nvidia-smi` command is unrecognized, and Ollama proceeds to operate in "CPU only"
-mode. This can be verified by using a monitoring tool like jtop.
-
-In order to address this, we simply pass the path to the Jetson's pre-installed CUDA libraries into `ollama serve` (while in a tmux session). We then hardcode the num_gpu parameters into a cloned
-version of our target model.
-
-Prerequisites:
-
-- curl
-- tmux
-
-Here are the steps:
-
 - Install Ollama via standard Linux command (ignore the 404 error): `curl https://ollama.com/install.sh | sh`
-- Stop the Ollama service: `sudo systemctl stop ollama`
-- Start Ollama serve in a tmux session called ollama_jetson and reference the CUDA libraries path: `tmux has-session -t ollama_jetson 2>/dev/null || tmux new-session -d -s ollama_jetson
-'LD_LIBRARY_PATH=/usr/local/cuda/lib64 ollama serve'`
 - Pull the model you want to use (e.g. mistral): `ollama pull mistral`
-- Create a new Modelfile specifically for enabling GPU support on the Jetson: `touch ModelfileMistralJetson`
-- In the ModelfileMistralJetson file, specify the FROM model and the num_gpu PARAMETER as shown below:
-
-```
-FROM mistral
-PARAMETER num_gpu 999
-```
-
-- Create a new model from your Modelfile: `ollama create mistral-jetson -f ./ModelfileMistralJetson`
-- Run the new model: `ollama run mistral-jetson`
-
-If you run a monitoring tool like jtop you should now see that Ollama is using the Jetson's integrated GPU.
+- Start an interactive session: `ollama run mistral`
 
 And that's it!
+
+# Running Ollama in Docker
+
+When running GPU accelerated applications in Docker, it is highly recommended to use [dusty-nv jetson-containers repo](https://github.com/dusty-nv/jetson-containers).
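
As a rough sketch, the three steps that remain after this commit (install, pull, run) could be wrapped in a small shell script. The function names `install_ollama` and `pull_and_run` are hypothetical and not part of the tutorial; only the commands inside them are taken from it, and the default model `mistral` is simply the tutorial's example.

```shell
# Sketch of the simplified Jetson workflow after this commit.
# Function names are illustrative assumptions; the commands themselves
# are the ones quoted in the tutorial.

# Install Ollama with the standard Linux installer from the tutorial.
install_ollama() {
  curl https://ollama.com/install.sh | sh
}

# Pull a model, then start an interactive session with it.
# The model name defaults to "mistral", the tutorial's example.
pull_and_run() {
  model="${1:-mistral}"
  ollama pull "$model" && ollama run "$model"
}
```

Calling `install_ollama` once and then `pull_and_run mistral` mirrors the remaining steps. Nothing runs until a function is called, so the file can be sourced safely.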