Taras Tsugrii
01ff2e14db
[nit] Remove unused msg local var. ( #2511 )
2024-02-20 14:02:34 -05:00
Jeffrey Morgan
f7231ad9ad
set shutting_down
to false
once shutdown is complete ( #2484 )
2024-02-13 17:48:41 -08:00
Daniel Hiltgen
6680761596
Shutdown faster
...
Make sure that when a shutdown signal comes, we shutdown quickly instead
of waiting for a potentially long exchange to wrap up.
2024-02-08 22:22:50 -08:00
Daniel Hiltgen
72b12c3be7
Bump llama.cpp to b1999
...
This requires an upstream change to support graceful termination,
carried as a patch.
2024-01-30 16:52:12 -08:00
Daniel Hiltgen
730dcfcc7a
Refine debug logging for llm
...
This wires up logging in llama.cpp to always go to stderr, and also
turns up logging if OLLAMA_DEBUG is set.
2024-01-22 12:26:49 -08:00
Daniel Hiltgen
ec3764538d
Probe GPUs before backend init
...
Detect potential error scenarios so we can fallback to CPU mode without
hitting asserts.
2024-01-21 15:59:38 -08:00
Daniel Hiltgen
1b249748ab
Add multiple CPU variants for Intel Mac
...
This also refines the build process for the ext_server build.
2024-01-17 15:08:54 -08:00
Jeffrey Morgan
557110d0ba
Disable mmap
with lora layers ( #1985 )
2024-01-13 23:36:31 -05:00
Daniel Hiltgen
8da7bef05f
Support multiple variants for a given llm lib type
...
In some cases we may want multiple variants for a given GPU type or CPU.
This adds logic to have an optional Variant which we can use to select
an optimal library, but also allows us to try multiple variants in case
some fail to load.
This can be useful for scenarios such as ROCm v5 vs v6 incompatibility
or potentially CPU features.
2024-01-10 17:27:51 -08:00
Jeffrey Morgan
2c6e8f5248
Update submodule to 6efb8eb30e7025b168f3fda3ff83b9b386428ad6
( #1885 )
...
* update submodule to `6efb8eb30e7025b168f3fda3ff83b9b386428ad6`
* unblock condition variable in `update_slots` when closing server
2024-01-10 16:48:38 -05:00
Jeffrey Morgan
dbdd50b283
add -DCMAKE_SYSTEM_NAME=Darwin
cmake flag ( #1832 )
2024-01-07 00:46:17 -05:00
Daniel Hiltgen
77d96da94b
Code shuffle to clean up the llm dir
2024-01-04 12:12:05 -08:00