ollama/llm
Daniel Hiltgen 58d95cc9bd Switch back to subprocessing for llama.cpp
This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process, shut it down when idle, and
gracefully restart it if it has problems. This also serves as a first step
toward running multiple copies to support multiple models concurrently.
2024-04-01 16:48:18 -07:00
ext_server Switch back to subprocessing for llama.cpp 2024-04-01 16:48:18 -07:00
generate Switch back to subprocessing for llama.cpp 2024-04-01 16:48:18 -07:00
llama.cpp@ad3a0505e3 Bump llama.cpp to b2527 2024-03-25 13:47:44 -07:00
patches Bump llama.cpp to b2474 2024-03-23 09:54:56 +01:00
ggla.go refactor model parsing 2024-04-01 13:16:15 -07:00
ggml.go update memory calculations 2024-04-01 13:16:32 -07:00
gguf.go refactor model parsing 2024-04-01 13:16:15 -07:00
llm.go Switch back to subprocessing for llama.cpp 2024-04-01 16:48:18 -07:00
llm_darwin_amd64.go Switch back to subprocessing for llama.cpp 2024-04-01 16:48:18 -07:00
llm_darwin_arm64.go Switch back to subprocessing for llama.cpp 2024-04-01 16:48:18 -07:00
llm_linux.go Switch back to subprocessing for llama.cpp 2024-04-01 16:48:18 -07:00
llm_windows.go Switch back to subprocessing for llama.cpp 2024-04-01 16:48:18 -07:00
payload.go Switch back to subprocessing for llama.cpp 2024-04-01 16:48:18 -07:00
server.go Switch back to subprocessing for llama.cpp 2024-04-01 16:48:18 -07:00
status.go Switch back to subprocessing for llama.cpp 2024-04-01 16:48:18 -07:00