Jeffrey Morgan
|
f7231ad9ad
|
set shutting_down to false once shutdown is complete (#2484)
|
2024-02-13 17:48:41 -08:00 |
|
Daniel Hiltgen
|
6680761596
|
Shutdown faster
Make sure that when a shutdown signal comes, we shut down quickly instead
of waiting for a potentially long exchange to wrap up.
|
2024-02-08 22:22:50 -08:00 |
|
Daniel Hiltgen
|
72b12c3be7
|
Bump llama.cpp to b1999
This requires an upstream change to support graceful termination,
carried as a patch.
|
2024-01-30 16:52:12 -08:00 |
|
Daniel Hiltgen
|
730dcfcc7a
|
Refine debug logging for llm
This wires up logging in llama.cpp to always go to stderr, and also
turns up logging if OLLAMA_DEBUG is set.
|
2024-01-22 12:26:49 -08:00 |
|
Daniel Hiltgen
|
ec3764538d
|
Probe GPUs before backend init
Detect potential error scenarios so we can fall back to CPU mode without
hitting asserts.
|
2024-01-21 15:59:38 -08:00 |
|
Jeffrey Morgan
|
557110d0ba
|
Disable mmap with lora layers (#1985)
|
2024-01-13 23:36:31 -05:00 |
|
Jeffrey Morgan
|
2c6e8f5248
|
Update submodule to 6efb8eb30e7025b168f3fda3ff83b9b386428ad6 (#1885)
* update submodule to `6efb8eb30e7025b168f3fda3ff83b9b386428ad6`
* unblock condition variable in `update_slots` when closing server
|
2024-01-10 16:48:38 -05:00 |
|
Jeffrey Morgan
|
dbdd50b283
|
add -DCMAKE_SYSTEM_NAME=Darwin cmake flag (#1832)
|
2024-01-07 00:46:17 -05:00 |
|
Daniel Hiltgen
|
77d96da94b
|
Code shuffle to clean up the llm dir
|
2024-01-04 12:12:05 -08:00 |
|