Jeffrey Morgan
63efa075a0
update generate scripts with new LLAMA_CUDA
variable, set HIP_PLATFORM
to avoid compiler errors ( #3528 )
2024-04-07 19:29:51 -04:00
Daniel Hiltgen
dfe330fa1c
Merge pull request #3488 from mofanke/fix-windows-dll-compress
...
fix dll compress in windows building
2024-04-04 16:12:13 -07:00
Daniel Hiltgen
36bd967722
Fail fast if mingw missing on windows
2024-04-04 09:51:26 -07:00
mofanke
4de0126719
fix dll compress in windows building
2024-04-04 21:27:33 +08:00
Daniel Hiltgen
e4a7e5b2ca
Fix CI release glitches
...
The subprocess change moved the build directory
arm64 builds weren't setting cross-compilation flags when building on x86
2024-04-03 16:41:40 -07:00
Jeffrey Morgan
cd135317d2
Fix macOS builds on older SDKs ( #3467 )
2024-04-03 10:45:54 -07:00
Daniel Hiltgen
58d95cc9bd
Switch back to subprocessing for llama.cpp
...
This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process and shutdown when idle, and
gracefully restart if it has problems. This also serves as a first step to be
able to run multiple copies to support multiple models concurrently.
2024-04-01 16:48:18 -07:00
Jeffrey Morgan
856b8ec131
remove need for $VSINSTALLDIR
since build will fail if ninja
cannot be found ( #3350 )
2024-03-26 16:23:16 -04:00
Jeremy
dfc6721b20
add support for libcudart.so for CUDA devices (adds Jetson support)
2024-03-25 11:07:44 -04:00
Daniel Hiltgen
ab3456207b
Merge pull request #3028 from ollama/ci_release
...
CI release process
2024-03-15 16:40:54 -07:00
Daniel Hiltgen
6ad414f31e
Merge pull request #3086 from dhiltgen/import_server
...
Import server.cpp to retain llava support
2024-03-15 16:10:35 -07:00
Daniel Hiltgen
d4c10df2b0
Add Radeon gfx940-942 GPU support
2024-03-15 15:34:58 -07:00
Daniel Hiltgen
540f4af45f
Wire up more complete CI for releases
...
Flesh out our github actions CI so we can build official releaes.
2024-03-15 12:37:36 -07:00
Daniel Hiltgen
85129d3a32
Adapt our build for imported server.cpp
2024-03-12 14:57:15 -07:00
Jeffrey Morgan
369eda65f5
update llama.cpp submodule to ceca1ae
( #3064 )
2024-03-11 12:57:48 -07:00
Daniel Hiltgen
bc13da2bfe
Avoid rocm runner and dependency clash
...
Putting the rocm symlink next to the runners is risky. This moves
the payloads into a subdir to avoid potential clashes.
2024-03-11 09:33:22 -07:00
Daniel Hiltgen
3dc1bb6a35
Harden for deps file being empty (or short)
2024-03-10 14:45:38 -07:00
Jeffrey Morgan
e11668aa07
add bundle_metal
and cleanup_metal
funtions to gen_darwin.sh
2024-03-09 16:04:57 -08:00
Jeffrey Morgan
1ffb1e2874
update llama.cpp submodule to 77d1ac7
( #3030 )
2024-03-09 15:55:34 -08:00
Daniel Hiltgen
6c5ccb11f9
Revamp ROCm support
...
This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var, that defaults to `~/.ollama` The logic was already
idempotenent, so this should speed up startups after the first time a
new release is deployed. It also cleans up after itself.
We now build only a single ROCm version (latest major) on both windows
and linux. Given the large size of ROCms tensor files, we split the
dependency out. It's bundled into the installer on windows, and a
separate download on windows. The linux install script is now smart and
detects the presence of AMD GPUs and looks to see if rocm v6 is already
present, and if not, then downloads our dependency tar file.
For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us. For Windows, we now use go's windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.
2024-03-07 10:36:50 -08:00
John
23ebe8fe11
fix some typos ( #2973 )
...
Signed-off-by: hishope <csqiye@126.com>
2024-03-06 22:50:11 -08:00
Bernhard M. Wiedemann
76e5d9ec88
Omit build date from gzip headers
...
See https://reproducible-builds.org/ for why this is good.
This patch was done while working on reproducible builds for openSUSE.
2024-02-29 16:48:19 +01:00
Jeffrey Morgan
efe040f8c0
reset with init_vars
ahead of each cpu build in gen_windows.ps1
( #2654 )
2024-02-21 16:35:34 -05:00
Daniel Hiltgen
4fcbf1cde6
Merge pull request #2599 from dhiltgen/fix_avx
...
Explicitly disable AVX2 on GPU builds
2024-02-19 13:13:05 -08:00
Daniel Hiltgen
df6dc4fd96
Fix duplicate menus on update and exit on signals
...
Also fixes a few fit-and-finish items for better developer experience
2024-02-16 15:33:16 -08:00
Daniel Hiltgen
db2a9ad1fe
Explicitly disable AVX2 on GPU builds
...
Even though we weren't setting it to on, somewhere in the cmake config
it was getting toggled on. By explicitly setting it to off, we get `/arch:AVX`
as intended.
2024-02-15 14:50:11 -08:00
Daniel Hiltgen
29e90cc13b
Implement new Go based Desktop app
...
This focuses on Windows first, but coudl be used for Mac
and possibly linux in the future.
2024-02-15 05:56:45 +00:00
Daniel Hiltgen
6d84f07505
Detect AMD GPU info via sysfs and block old cards
...
This wires up some new logic to start using sysfs to discover AMD GPU
information and detects old cards we can't yet support so we can fallback to CPU mode.
2024-02-12 08:19:41 -08:00
Daniel Hiltgen
27aa2d4a19
Merge pull request #1849 from mraiser/main
...
Accomodate split cuda lib dir
2024-02-05 16:01:16 -08:00
Daniel Hiltgen
e1f50377f4
Harden generate patching model
...
Only apply patches if we have any, and make sure to cleanup
every file we patched at the end to leave the tree clean
2024-02-01 19:34:36 -08:00
mraiser
4c4c730a0a
Merge branch 'ollama:main' into main
2024-01-27 21:56:11 -05:00
Daniel Hiltgen
e02ecfb6c8
Merge pull request #2116 from dhiltgen/cc_50_80
...
Add support for CUDA 5.0 cards
2024-01-27 10:28:38 -08:00
Jeffrey Morgan
a64570dcae
Fix clearing kv cache between requests with the same prompt ( #2186 )
...
* Fix clearing kv cache between requests with the same prompt
* fix powershell script
2024-01-25 13:46:20 -08:00
mraiser
a4564232a4
Update gen_linux.sh to find libcudart in separate directory
2024-01-25 09:49:35 -05:00
Daniel Hiltgen
0f5b843319
Refine Accelerate usage on mac
...
For old macs, accelerate seems to cause crashes, but for
AVX2 capable macs, it does not.
2024-01-22 16:25:56 -08:00
Daniel Hiltgen
df54c723ae
Make CPU builds parallel and customizable AMD GPUs
...
The linux build now support parallel CPU builds to speed things up.
This also exposes AMD GPU targets as an optional setting for advaced
users who want to alter our default set.
2024-01-21 15:12:21 -08:00
Daniel Hiltgen
a447a083f2
Add compute capability 5.0, 7.5, and 8.0
2024-01-20 14:24:05 -08:00
Daniel Hiltgen
681a914990
Add support for CUDA 5.2 cards
2024-01-20 10:48:43 -08:00
Jeffrey Morgan
4c54f0ddeb
sign dylibs on macOS ( #2101 )
2024-01-19 19:24:11 -05:00
Jeffrey Morgan
dc88cc3981
use gzip
for runner embedding ( #2067 )
2024-01-19 13:23:03 -05:00
Daniel Hiltgen
fccdf4c635
Merge pull request #1987 from xyproto/archlinux
...
Let gpu.go and gen_linux.sh also find CUDA on Arch Linux
2024-01-18 13:32:10 -08:00
Daniel Hiltgen
1b249748ab
Add multiple CPU variants for Intel Mac
...
This also refines the build process for the ext_server build.
2024-01-17 15:08:54 -08:00
Alexander F. Rødseth
cbe2adc78a
Merge branch 'main' into archlinux
2024-01-17 12:50:11 +01:00
Daniel Hiltgen
795674dd90
Bump llama.cpp to b1842 and add new cuda lib dep
...
Upstream llama.cpp has added a new dependency with the
NVIDIA CUDA Driver Libraries (libcuda.so) which is part of the
driver distribution, not the general cuda libraries, and is not
available as an archive, so we can not statically link it. This may
introduce some additional compatibility challenges which we'll
need to keep an eye on.
2024-01-16 12:53:52 -08:00
Daniel Hiltgen
8795447dad
Merge pull request #1966 from fpreiss/fpreiss/gen_linux_cuda_detection
...
improve cuda detection (rel. issue #1704 )
2024-01-14 18:00:11 -08:00
Daniel Hiltgen
3ca5f69ce8
Fix typo in arm mac arch script
2024-01-14 08:32:57 -08:00
Alexander F. Rødseth
f4bf1d514f
Let gpu.go and gen_linux.sh also find CUDA on Arch Linux
2024-01-14 13:40:36 +01:00
Daniel Hiltgen
2ecb247276
Fix intel mac build
...
Make sure we're building an x86 ext_server lib when cross-compiling
2024-01-13 14:46:34 -08:00
Jeffrey Morgan
288ef8ff95
add gcc -lstdc++
flag for linux cpu ( #1974 )
2024-01-13 03:53:00 -05:00
Jeffrey Morgan
4cf17990f7
use g++ to build libext_server.so
on linux ( #1972 )
2024-01-13 03:12:42 -05:00