Jagadish Krishnamoorthy
59d87127f5
Update gpu_info_rocm.c
2024-01-26 22:08:27 -08:00
Patrick Devine
b5cf31b460
add keep_alive to generate/chat/embedding api endpoints ( #2146 )
2024-01-26 14:28:02 -08:00
Daniel Hiltgen
cc4915e262
Merge pull request #2214 from dhiltgen/reject_cuda_without_avx
...
Detect lack of AVX and fallback to CPU mode
2024-01-26 12:06:44 -08:00
Daniel Hiltgen
667a2ba18a
Detect lack of AVX and fallback to CPU mode
...
We build the GPU libraries with AVX enabled to ensure that if not all
layers fit on the GPU we get better performance in a mixed mode.
If the user is using a virtualization/emulation system that lacks AVX
this used to result in an illegal instruction error and crash before this
fix. Now we will report a warning in the server log, and just use
CPU mode to ensure we don't crash.
2024-01-26 11:36:03 -08:00
Michael Yang
e054ebe059
Merge pull request #2212 from ollama/mxyng/fix-build
...
fix build
2024-01-26 11:19:08 -08:00
Michael Yang
9d3dcfd0ec
fix logging
2024-01-26 11:04:27 -08:00
Michael Yang
6e0ea5ecc8
Merge pull request #1916 from ollama/mxyng/inactivity-monitor
...
download: add inactivity monitor
2024-01-26 10:56:00 -08:00
Daniel Hiltgen
a47d8b2557
Merge pull request #2197 from dhiltgen/remove_rocm_image
...
Add back ROCm container support
2024-01-26 09:34:23 -08:00
Daniel Hiltgen
30c43c285c
Merge pull request #2195 from dhiltgen/rocm_real_gpus
...
Ignore AMD integrated GPUs
2024-01-26 09:30:24 -08:00
Daniel Hiltgen
23a7ea593b
Merge pull request #2209 from dhiltgen/harden_mgmt
...
Fix crash on cuda ml init failure
2024-01-26 09:30:13 -08:00
Daniel Hiltgen
75c44aa319
Add back ROCm container support
...
This adds ROCm support back as a discrete image.
2024-01-26 09:24:29 -08:00
Daniel Hiltgen
9d7b5d6c91
Ignore AMD integrated GPUs
...
Detect and ignore integrated GPUs reported by rocm.
2024-01-26 09:21:35 -08:00
Daniel Hiltgen
5d9c4a5f5a
Fix crash on cuda ml init failure
...
The new driver lookup code was triggering after init failure due to a missing return
2024-01-26 09:18:33 -08:00
Daniel Hiltgen
197e420a97
Merge pull request #2196 from dhiltgen/remove_rocm_image
...
Switch back to ubuntu base
2024-01-25 16:50:32 -08:00
Daniel Hiltgen
a34e1ad3cf
Switch back to ubuntu base
...
The size increase for rocm support in the standard image is problematic
We'll revisit multiple tags for rocm support in a follow up PR.
2024-01-25 16:46:01 -08:00
Michael Yang
2ae0556292
Merge pull request #1679 from ollama/mxyng/build-gpus
...
build cuda and rocm
2024-01-25 16:38:14 -08:00
Jeffrey Morgan
5be9bdd444
Update modelfile.md
2024-01-25 16:29:48 -08:00
Jeffrey Morgan
b706794905
Update modelfile.md to include MESSAGE
2024-01-25 16:29:32 -08:00
Michael Yang
a8c5413d06
only generate gpu libs
2024-01-25 15:41:56 -08:00
Michael Yang
5580de4571
archive ollama binaries
2024-01-25 15:40:16 -08:00
Michael Yang
946431d5b0
build cuda and rocm
2024-01-25 15:40:15 -08:00
Michael Yang
0610126049
remove env setting
2024-01-25 15:39:43 -08:00
Jeffrey Morgan
3ebd6a83fc
update submodule to cd4fddb29f81d6a1f6d51a0c016bc6b486d68def
2024-01-25 13:54:11 -08:00
Jeffrey Morgan
a64570dcae
Fix clearing kv cache between requests with the same prompt ( #2186 )
...
* Fix clearing kv cache between requests with the same prompt
* fix powershell script
2024-01-25 13:46:20 -08:00
Patrick Devine
7c40a67841
Save and load sessions ( #2063 )
2024-01-25 12:12:36 -08:00
Michael Yang
e64b5b07a2
Merge pull request #2181 from ollama/mxyng/stub-lint
...
stub generate outputs for lint
2024-01-25 11:55:15 -08:00
Michael Yang
9e1e295cdc
Merge pull request #2175 from ollama/mxyng/refactor-tensor-read
...
refactor tensor read
2024-01-25 09:22:42 -08:00
Jeffrey Morgan
a643823f86
Update README.md
2024-01-24 21:36:56 -08:00
Michael Yang
8e5d359a03
stub generate outputs for lint
2024-01-24 17:36:10 -08:00
Daniel Hiltgen
a170888dd4
Merge pull request #2174 from dhiltgen/rocm_real_gpus
...
More logging for gpu management
2024-01-24 11:09:17 -08:00
Michael Yang
cd22855ef8
refactor tensor read
2024-01-24 10:48:31 -08:00
Daniel Hiltgen
013fd07139
More logging for gpu management
...
Fix an ordering glitch of dlerr/dlclose and add more logging to help
root cause some crashes users are hitting. This also refines the
function pointer names to use the underlying function names instead
of simplified names for readability.
2024-01-24 10:32:36 -08:00
Daniel Hiltgen
f63dc2db5c
Merge pull request #2162 from dhiltgen/rocm_real_gpus
...
Report more information about GPUs in verbose mode
2024-01-23 17:45:40 -08:00
Jeffrey Morgan
eaa5a396d9
Update README.md
2024-01-23 16:08:15 -08:00
Jeffrey Morgan
8ed22f5d72
Update README.md
2024-01-23 14:38:01 -08:00
Daniel Hiltgen
987c16b2f7
Report more information about GPUs in verbose mode
...
This adds additional calls to both CUDA and ROCm management libraries to
discover additional attributes about the GPU(s) detected in the system, and
wires up runtime verbosity selection. When users hit problems with GPUs we can
ask them to run with `OLLAMA_DEBUG=1 ollama serve` and share the results.
2024-01-23 11:37:02 -08:00
Jeffrey Morgan
950f636d64
Update README.md
2024-01-23 10:29:10 -08:00
Jeffrey Morgan
4458efb73a
Load all layers on arm64
macOS if model is small enough ( #2149 )
2024-01-22 17:40:06 -08:00
Daniel Hiltgen
ceea599494
Merge pull request #2150 from dhiltgen/default_version
...
Set a default version using git describe
2024-01-22 17:38:27 -08:00
Daniel Hiltgen
3005ec74b3
Set a default version using git describe
...
If a VERSION is not specified, this will generate a version string that
represents the state of the repo. For example `0.1.21-12-gffaf52e-dirty`
representing 12 commits away from 0.1.21 tag, on commit gffaf52e
and the tree is dirty.
2024-01-22 17:12:20 -08:00
Daniel Hiltgen
0759d8996e
Merge pull request #2148 from dhiltgen/intel_mac
...
Refine Accelerate usage on mac
2024-01-22 16:56:58 -08:00
Daniel Hiltgen
0f5b843319
Refine Accelerate usage on mac
...
For old macs, accelerate seems to cause crashes, but for
AVX2 capable macs, it does not.
2024-01-22 16:25:56 -08:00
Jeffrey Morgan
ffaf52e1e9
update submodule to 011e8ec577fd135cbc02993d3ea9840c516d6a1c
2024-01-22 15:16:54 -08:00
Michael Yang
940b10b036
Merge pull request #2144 from jmorganca/mxyng/update-faq
...
faq: update to use launchctl setenv
2024-01-22 13:46:57 -08:00
Daniel Hiltgen
3bc28736cd
Merge pull request #2143 from dhiltgen/llm_verbosity
...
Refine debug logging for llm
2024-01-22 13:19:16 -08:00
Michael Yang
93a756266c
faq: update to use launchctl setenv
2024-01-22 13:10:13 -08:00
Daniel Hiltgen
a0a829bf7a
Merge pull request #2142 from dhiltgen/debug_on_fail
...
Debug logging on init failure
2024-01-22 12:29:22 -08:00
Daniel Hiltgen
730dcfcc7a
Refine debug logging for llm
...
This wires up logging in llama.cpp to always go to stderr, and also
turns up logging if OLLAMA_DEBUG is set.
2024-01-22 12:26:49 -08:00
Daniel Hiltgen
27a2d5af54
Debug logging on init failure
2024-01-22 12:08:22 -08:00
Jeffrey Morgan
5f81a33f43
update submodule to 6f9939d
( #2115 )
2024-01-22 11:56:40 -08:00