58d95cc9bd
This should resolve a number of memory leak and stability defects by allowing us to isolate llama.cpp in a separate process, shut it down when idle, and gracefully restart it if it has problems. This also serves as a first step toward running multiple copies to support multiple models concurrently.

ext_server
generate
llama.cpp@ad3a0505e3
patches
ggla.go
ggml.go
gguf.go
llm.go
llm_darwin_amd64.go
llm_darwin_arm64.go
llm_linux.go
llm_windows.go
payload.go
server.go
status.go