Commit graph

42 commits

Author SHA1 Message Date
696e20eeae
Merge https://github.com/ollama/ollama
Signed-off-by: baalajimaestro <me@baalajimaestro.me>
2024-07-16 21:50:57 +05:30
Daniel Hiltgen
224337b32f Bump linux ROCm to 6.1.2 2024-07-15 15:10:22 -07:00
3bb134eaa0
Use alpine and remove blas
Signed-off-by: baalajimaestro <me@baalajimaestro.me>
2024-07-07 23:18:27 +05:30
Daniel Hiltgen
020bd60ab2 Switch amd container image base to rocky 8
The centos 7 arm mirrors have disappeared due to the EOL 2 days
ago, and the vault sed workaround which works for x86 doesn't work for arm.
2024-07-02 10:34:47 -07:00
55ce7d9fc2
Make run model a oneshot service
Signed-off-by: baalajimaestro <me@baalajimaestro.me>
2024-07-01 18:36:49 +05:30
e8f73063d0
Add building on oneapi
Also handle model names easily for docker

Signed-off-by: baalajimaestro <me@baalajimaestro.me>
2024-07-01 16:50:29 +05:30
Daniel Hiltgen
26ab67732b Bump ROCm linux to 6.1.1 2024-06-14 15:37:54 -07:00
Jeremy
8aec92fa6d rearranged conditional logic for static build, dockerfile updated 2024-04-17 14:43:28 -04:00
Jeremy
70261b9bb6 move static build to its own flag 2024-04-17 13:04:28 -04:00
Daniel Hiltgen
c2d813bdc3 Fix rocm deps with new subprocess paths 2024-04-11 12:52:06 -07:00
Daniel Hiltgen
58d95cc9bd Switch back to subprocessing for llama.cpp
This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process and shutdown when idle, and
gracefully restart if it has problems.  This also serves as a first step to be
able to run multiple copies to support multiple models concurrently.
2024-04-01 16:48:18 -07:00
Daniel Hiltgen
c91a4ebcff Bump ROCm to 6.0.2 patch release 2024-03-28 15:58:57 -07:00
Patrick Devine
1b272d5bcd
change github.com/jmorganca/ollama to github.com/ollama/ollama (#3347) 2024-03-26 13:04:17 -07:00
Daniel Hiltgen
e0319bd78d Revert "Switch arm cuda base image to centos 7"
This reverts commit 5dacc1ebe8.
2024-03-25 19:01:11 -07:00
Daniel Hiltgen
5dacc1ebe8 Switch arm cuda base image to centos 7
We had started using rocky linux 8, but they've updated to GCC 10.3,
which breaks NVCC.  10.2 is compatible (or 10.4, but that's not
available from rocky linux 8 repos yet)
2024-03-25 15:57:08 -07:00
Bruce MacDonald
a5ba0fcf78
doc: faq gpu compatibility (#3142) 2024-03-21 05:21:34 -04:00
Daniel Hiltgen
540f4af45f Wire up more complete CI for releases
Flesh out our github actions CI so we can build official releaes.
2024-03-15 12:37:36 -07:00
Jeffrey Morgan
b5fcd9d3aa
use -trimpath when building releases (#3069) 2024-03-11 15:58:46 -07:00
Daniel Hiltgen
82ca694d68
Rename ROCm deps file to avoid confusion (#3025) 2024-03-09 17:48:38 -08:00
Daniel Hiltgen
6c5ccb11f9 Revamp ROCm support
This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var, that defaults to `~/.ollama` The logic was already
idempotenent, so this should speed up startups after the first time a
new release is deployed.  It also cleans up after itself.

We now build only a single ROCm version (latest major) on both windows
and linux.  Given the large size of ROCms tensor files, we split the
dependency out.  It's bundled into the installer on windows, and a
separate download on windows.  The linux install script is now smart and
detects the presence of AMD GPUs and looks to see if rocm v6 is already
present, and if not, then downloads our dependency tar file.

For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us.  For Windows, we now use go's windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.
2024-03-07 10:36:50 -08:00
Jeffrey Morgan
d481fb3cc8
update go to 1.22 in other places (#2975) 2024-03-07 07:39:49 -08:00
Daniel Hiltgen
794a916a72 Add env var so podman will map cuda GPUs
Without this env var, podman's GPU logic doesn't map the GPU through
2024-02-29 08:43:08 -08:00
Daniel Hiltgen
75c44aa319 Add back ROCm container support
This adds ROCm support back as a discrete image.
2024-01-26 09:24:29 -08:00
Daniel Hiltgen
a34e1ad3cf Switch back to ubuntu base
The size increase for rocm support in the standard image is problematic
We'll revisit multiple tags for rocm support in a follow up PR.
2024-01-25 16:46:01 -08:00
Daniel Hiltgen
df54c723ae Make CPU builds parallel and customizable AMD GPUs
The linux build now support parallel CPU builds to speed things up.
This also exposes AMD GPU targets as an optional setting for advaced
users who want to alter our default set.
2024-01-21 15:12:21 -08:00
Daniel Hiltgen
da72235ebf Combine the 2 Dockerfiles and add ROCm
This renames Dockerfile.build to Dockerfile, and adds some new stages
to support 2 modes of building - the build_linux.sh script uses
intermediate stages to extract the artifacts for ./dist, and the default
build generates a container image usable by both cuda and rocm cards.
This required transitioniing the x86 base to the rocm image to avoid
layer bloat.
2024-01-21 11:37:11 -08:00
Michael Yang
0409c1fa59
docker: set PATH, LD_LIBRARY_PATH, and capabilities (#1336)
* docker: set PATH, LD_LIBRARY_PATH, and capabilities

* example: update k8s gpu manifest
2023-11-30 21:16:56 -08:00
Jeffrey Morgan
89ba19feca use Go 1.21.3 in Dockerfile 2023-10-12 23:23:12 -04:00
Jeffrey Morgan
dc87e9c9ae update Dockerfile to pass GOFLAGS 2023-10-03 07:05:15 -07:00
Michael Yang
0a4f21c0a7
fix docker build (#659) 2023-09-30 13:34:01 -07:00
Jeffrey Morgan
9abb66254a docker: fix volume permission errors 2023-09-30 12:32:15 -07:00
Michael Yang
92d454ec5f update build_darwin.sh 2023-09-29 11:29:23 -07:00
Jeffrey Morgan
2ded8ab206 use 11.8.0 nvidia dockerfile base image for now 2023-09-26 21:48:41 -07:00
Michael Yang
93d3a2568d replace dockerfile 2023-09-22 11:57:38 -07:00
Michael Yang
9aa192c812 update cuda docker image 2023-09-14 11:25:20 -07:00
Michael Yang
9795b43d93 update dockerfile 2023-09-06 15:31:25 -07:00
Jeffrey Morgan
7c71c10d4f fix compilation issue in Dockerfile, remove from README.md until ready 2023-07-11 19:51:08 -07:00
Jeffrey Morgan
ea809df196 update Dockerfile to use OLLAMA_HOST 2023-07-07 23:43:50 -04:00
Jeffrey Morgan
fdb93ef2aa fix dockerfile 2023-07-06 16:34:44 -04:00
Jeffrey Morgan
6292f4b64c update Dockerfile 2023-07-06 16:34:44 -04:00
Jeffrey Morgan
48920c873b add basic Dockerfile 2023-06-30 12:19:04 -04:00
Jeffrey Morgan
54a94566f1 add Dockerfile 2023-06-30 10:47:55 -04:00