Daniel Hiltgen
58d95cc9bd
Switch back to subprocessing for llama.cpp
...
This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process and shut down when idle, and
gracefully restart if it has problems. This also serves as a first step to be
able to run multiple copies to support multiple models concurrently.
2024-04-01 16:48:18 -07:00
Daniel Hiltgen
29e90cc13b
Implement new Go based Desktop app
...
This focuses on Windows first, but could be used for Mac
and possibly Linux in the future.
2024-02-15 05:56:45 +00:00
Daniel Hiltgen
d4cd695759
Add cgo implementation for llama.cpp
...
Run the server.cpp directly inside the Go runtime via cgo
while retaining the LLM Go abstractions.
2023-12-19 09:05:46 -08:00
Jason Jacobs
3d620f9462
ignore JetBrains IDEs ( #1287 )
2023-11-27 15:57:45 -05:00
Jing Zhang
82b9b329ff
windows CUDA support ( #1262 )
...
* Support cuda build in Windows
* Enable dynamic NumGPU allocation for Windows
2023-11-24 17:16:36 -05:00
Jeffrey Morgan
85e4441c6a
cache docker builds
2023-11-18 08:51:38 -05:00
Jeffrey Morgan
a82eb275ff
update docs for subprocess
2023-08-30 17:54:02 -04:00
Bruce MacDonald
42998d797d
subprocess llama.cpp server ( #401 )
...
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
2023-08-30 16:35:03 -04:00
Jeffrey Morgan
67b6f8ba86
add ggml-metal.metal to .gitignore
2023-07-28 11:04:21 -04:00
jk1jk
e6c427ce4d
Update .gitignore
2023-07-22 17:00:52 +03:00
Jeffrey Morgan
7c71c10d4f
fix compilation issue in Dockerfile, remove from README.md until ready
2023-07-11 19:51:08 -07:00
Michael Yang
442dec1c6f
vendor llama.cpp
2023-07-11 11:59:18 -07:00
Michael Yang
fd4792ec56
call llama.cpp directly from go
2023-07-11 11:59:18 -07:00
Jeffrey Morgan
9fe018675f
use Makefile for dependency building instead of go generate
2023-07-06 16:34:44 -04:00
Jeffrey Morgan
b0e986fb96
add binary to .gitignore
2023-07-06 16:34:44 -04:00
Bruce MacDonald
d34985b9df
add templates to prompt command
2023-06-26 13:41:16 -04:00
Jeffrey Morgan
b361fa72ec
reorganize directories
2023-06-25 13:08:03 -04:00
Jeffrey Morgan
d3709f85b5
build server into desktop app
2023-06-25 00:30:02 -04:00
Bruce MacDonald
c5bafaff54
package server with client
2023-06-23 18:38:22 -04:00
Bruce MacDonald
f0eee3faa0
build server executable
2023-06-23 17:23:30 -04:00
Bruce MacDonald
db81d81b23
Update .gitignore
2023-06-23 13:57:03 -04:00
Jeffrey Morgan
8fa91332fa
initial commit
2023-06-22 18:31:40 -04:00