Commit graph

11 commits

Author SHA1 Message Date
Daniel Hiltgen
325d74985b Fix CPU performance on hyperthreaded systems
The default thread count logic was broken: on a hyperthreaded CPU it produced
2x as many threads as it should, which resulted in thrashing and poor
performance.
2023-12-21 16:23:36 -08:00
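
For context on the fix above, here is a minimal Go sketch of the kind of default thread-count heuristic the commit message describes. The function name and the hyperthreading flag are assumptions made for illustration; this is not the code from the commit.

```go
// Illustrative only: a guess at a sane default worker count for CPU
// inference. runtime.NumCPU reports logical CPUs, so on a hyperthreaded
// machine using it directly gives roughly 2x the physical core count,
// the over-subscription the commit message blames for thrashing.
package main

import (
	"fmt"
	"runtime"
)

// defaultThreads (hypothetical) targets physical cores rather than
// logical ones when hyperthreading is assumed to be enabled.
func defaultThreads(hyperthreaded bool) int {
	n := runtime.NumCPU()
	if hyperthreaded && n > 1 {
		n /= 2 // assumes two logical CPUs per physical core
	}
	return n
}

func main() {
	fmt.Println("threads:", defaultThreads(true))
}
```
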
Daniel Hiltgen
9adca7f711 Bump llama.cpp to b1662 and set n_parallel=1 2023-12-19 09:05:46 -08:00
Daniel Hiltgen
35934b2e05 Adapted rocm support to cgo based llama.cpp 2023-12-19 09:05:46 -08:00
Daniel Hiltgen
d4cd695759 Add cgo implementation for llama.cpp
Run the server.cpp directly inside the Go runtime via cgo
while retaining the LLM Go abstractions.
2023-12-19 09:05:46 -08:00
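
The cgo commit above changes how the C++ code is invoked; the sketch below illustrates only the general mechanism of calling C code in-process from Go. The C function is a stand-in, not the actual server.cpp entry points wrapped by this commit.

```go
// Illustrative only: the general shape of a cgo binding. The real change
// compiles llama.cpp's server.cpp into the Go binary; this stand-in just
// shows C code running inside the Go runtime, with no separate process.
package main

/*
// Stand-in C function; the commit wraps llama.cpp's server.cpp instead.
static int add(int a, int b) { return a + b; }
*/
import "C"

import "fmt"

func main() {
	// The C call happens in-process, preserving the Go-side abstractions
	// that sit above it.
	fmt.Println("2 + 3 =", int(C.add(2, 3)))
}
```
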
Bruce MacDonald
811b1f03c8 deprecate ggml
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf if the automatic pull fails

Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-12-19 09:05:46 -08:00
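
The entry above describes a detect-and-fall-back flow; a hypothetical Go sketch of that flow follows. The function names are invented for illustration and are not the project's actual API.

```go
// Hypothetical sketch of the described behavior: detect a deprecated ggml
// model, try to pull a gguf replacement automatically, and if that fails,
// tell the user to update to gguf themselves.
package main

import (
	"errors"
	"fmt"
)

// isGGML and pullGGUF are placeholders standing in for real format
// detection and registry pulls.
func isGGML(path string) bool    { return true }
func pullGGUF(name string) error { return errors.New("no gguf variant found") }

func loadModel(name, path string) error {
	if !isGGML(path) {
		return nil // already gguf, nothing to do
	}
	if err := pullGGUF(name); err != nil {
		// Automatic pull failed: surface a clear upgrade message.
		return fmt.Errorf("model %q uses the deprecated ggml format; please update it to gguf: %w", name, err)
	}
	return nil
}

func main() {
	fmt.Println(loadModel("example", "example.bin"))
}
```
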
Michael Yang
a00fac4ec8 update llama.cpp 2023-11-21 09:50:02 -08:00
Jeffrey Morgan
b0c9cd0f3b fix metal assertion errors 2023-10-24 00:32:36 -07:00
Michael Yang
c9167494cb update default log target 2023-10-23 10:44:50 -07:00
Bruce MacDonald
f3648fd206 Update llama.cpp gguf to latest (#710) 2023-10-17 16:55:16 -04:00
Michael Yang
058d0cd04b silence warm up log 2023-09-21 14:53:33 -07:00
Michael Yang
6c6a31a1e8 embed libraries using cmake 2023-09-20 14:41:57 -07:00