ollama/llm
Jesse Gross 03408f3437 server: Don't clear cmd when closing a server
Close can be called on an LLM server if the runner subprocess dies.
However, the Ollama scheduler code may not know about this yet and
still try to access it. In this case, it is important that 'cmd'
is still available as it is used to check on the status of the
subprocess. If this happens, Kill may be called twice on the subprocess -
that is fine.

In addition, model unloading may race with new accesses, so we should
hold a lock around this. This may result in the model being reloaded
after the first Close call; that is also fine, as Close will be called
again later.
2024-10-09 20:39:04 -07:00
ext_server Re-introduce the llama package (#5034) 2024-10-08 08:53:54 -07:00
generate Fix build leakages (#7141) 2024-10-08 13:04:59 -07:00
llama.cpp@8962422b1c llm: update llama.cpp commit to 8962422 (#6618) 2024-09-03 21:12:39 -04:00
patches llm: add solar pro (preview) (#6846) 2024-09-17 18:11:26 -07:00
filetype.go Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS. IQ4_NL (#4322) 2024-05-23 13:21:49 -07:00
ggla.go update convert test to check result data 2024-07-31 10:59:38 -07:00
ggml.go Merge pull request #6260 from ollama/mxyng/mem 2024-09-05 13:22:08 -07:00
ggml_test.go llm: speed up gguf decoding by a lot (#5246) 2024-06-24 21:47:52 -07:00
gguf.go add conversion for microsoft phi 3 mini/medium 4k, 128 2024-08-12 15:13:29 -07:00
llm_darwin.go Optimize container images for startup (#6547) 2024-09-12 12:10:30 -07:00
llm_linux.go Optimize container images for startup (#6547) 2024-09-12 12:10:30 -07:00
llm_windows.go runner: Set windows above normal priority (#6905) 2024-09-21 16:54:49 -07:00
memory.go Improve logging on GPU too small (#6666) 2024-09-06 08:29:36 -07:00
memory_test.go llama3.1 2024-08-21 11:49:31 -07:00
server.go server: Don't clear cmd when closing a server 2024-10-09 20:39:04 -07:00
status.go Catch one more error log 2024-08-05 09:28:07 -07:00