ollama

1662 commits 1 branch 0 tags 13 MiB

Author	SHA1	Message	Date
Daniel Hiltgen	d4cd695759	Add cgo implementation for llama.cpp. Run the server.cpp directly inside the Go runtime via cgo while retaining the LLM Go abstractions.	2023-12-19 09:05:46 -08:00
Bruce MacDonald	811b1f03c8	deprecate ggml: remove ggml runner; automatically pull gguf models when ggml detected; tell users to update to gguf in the case automatic pull fails. Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>	2023-12-19 09:05:46 -08:00
Jing Zhang	82b9b329ff	windows CUDA support (#1262): Support cuda build in Windows; Enable dynamic NumGPU allocation for Windows	2023-11-24 17:16:36 -05:00
Jeffrey Morgan	3a1ed9ff70	restore building runner with `AVX` on by default (#900)	2023-10-27 12:13:44 -07:00
Michael Yang	c9167494cb	update default log target	2023-10-23 10:44:50 -07:00
Bruce MacDonald	5d22319a2c	rename server subprocess (#700) - this makes it easier to see that the subprocess is associated with ollama	2023-10-06 10:15:42 -04:00
Michael Yang	058d0cd04b	silence warm up log	2023-09-21 14:53:33 -07:00
Michael Yang	a9ed7cc6aa	rename generate.go	2023-09-20 14:42:17 -07:00

Renamed from llm/llama.cpp/generate.go