* Clean up documentation
Will probably need to update with PRs for new release.
Signed-off-by: Matt Williams <m@technovangelist.com>
* Correcting to fit in 0.1.15 changes
Signed-off-by: Matt Williams <m@technovangelist.com>
* Update README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* addressing comments
Signed-off-by: Matt Williams <m@technovangelist.com>
* more api cleanup
Signed-off-by: Matt Williams <m@technovangelist.com>
* it's llava, not llama
Signed-off-by: Matt Williams <m@technovangelist.com>
* Update docs/troubleshooting.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Updated hosting to server and documented all env vars
Signed-off-by: Matt Williams <m@technovangelist.com>
* remove last of the cli descriptions
Signed-off-by: Matt Williams <m@technovangelist.com>
* Update README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* update further per conversation with jeff earlier today
Signed-off-by: Matt Williams <m@technovangelist.com>
* cleanup the doc readme
Signed-off-by: Matt Williams <m@technovangelist.com>
* move upgrade to faq
Signed-off-by: Matt Williams <m@technovangelist.com>
* first change
Signed-off-by: Matt Williams <m@technovangelist.com>
* updated
Signed-off-by: Matt Williams <m@technovangelist.com>
* Update docs/faq.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Update docs/README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Update README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Update docs/README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Update docs/README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* examples in parent
Signed-off-by: Matt Williams <m@technovangelist.com>
* add example for create model.
Signed-off-by: Matt Williams <m@technovangelist.com>
* update faq
Signed-off-by: Matt Williams <m@technovangelist.com>
* update create model api
Signed-off-by: Matt Williams <m@technovangelist.com>
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Update docs/faq.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Update docs/troubleshooting.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* update the readme in docs
Signed-off-by: Matt Williams <m@technovangelist.com>
* update a few more things
Signed-off-by: Matt Williams <m@technovangelist.com>
* Update docs/troubleshooting.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Update docs/faq.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Update README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Update docs/modelfile.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Update docs/troubleshooting.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
---------
Signed-off-by: Matt Williams <m@technovangelist.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
By default, builds now produce non-debug, non-verbose binaries.
To enable verbose logs in llama.cpp and debug symbols in the
native code, set `CGO_CFLAGS=-g`.
The default thread count logic was broken: on a hyperthreading CPU it
used twice as many threads as it should, which caused thrashing and
poor performance.
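
For illustration only, one way to avoid this kind of oversubscription is to
size the default by physical cores rather than logical CPUs. The sketch below
is not the code from this change; `defaultThreads` and both of its parameters
are hypothetical, and a real build would query the core topology from the OS.

```go
package main

import "fmt"

// defaultThreads is a hypothetical helper: it returns one thread per
// physical core, given the logical CPU count and the number of hardware
// threads per core (2 on a typical hyperthreading CPU).
func defaultThreads(logicalCPUs, threadsPerCore int) int {
	if threadsPerCore < 1 {
		threadsPerCore = 1 // treat unknown topology as no hyperthreading
	}
	n := logicalCPUs / threadsPerCore
	if n < 1 {
		n = 1 // always run at least one thread
	}
	return n
}

func main() {
	// 16 logical CPUs with 2 threads per core means 8 physical cores.
	fmt.Println(defaultThreads(16, 2))
}
```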
The Windows native setup still needs more work, but this gets it building
again. If you set the PATH properly, you can run the resulting exe on a CUDA system.
This switches the default llama.cpp to be CPU-based, and builds the GPU variants
as dynamically loaded libraries which we can select at runtime.
This also bumps the ROCm library to version 6, because 5.7 builds don't work
on the latest ROCm library that just shipped.
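
As a rough sketch of runtime selection with a CPU default (an assumption about
the shape of the logic, not the code in this change): `pickVariant`, the
variant names, and the `available` map are all hypothetical, and the actual
loading of a dynamic library is left out.

```go
package main

import "fmt"

// pickVariant is a hypothetical helper: it returns the first preferred GPU
// variant whose library was found, and falls back to the CPU build.
func pickVariant(preferred []string, available map[string]bool) string {
	for _, v := range preferred {
		if available[v] {
			return v
		}
	}
	return "cpu" // the CPU-based build is always the default
}

func main() {
	// Assume only the ROCm variant was found on this machine.
	available := map[string]bool{"rocm": true}
	fmt.Println(pickVariant([]string{"cuda", "rocm"}, available))
}
```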
The build tag `rocm` or `cuda` must be specified to both go generate and go build.
ROCm builds need ROCM_PATH set (with the ROCm SDK present), CLBlast installed
(for GGML), and CLBlast_DIR set in the environment to the CLBlast cmake
directory (likely /usr/lib/cmake/CLBlast). Build tags are also used to switch
VRAM detection between the CUDA and ROCm implementations, using added
"accelerator_foo.go" files which contain architecture-specific functions
and variables. accelerator_none is used when no tags are set, and the helper
function addRunner ignores it if it is the chosen accelerator. Also fixes the
go generate commands; thanks @deadmeu for testing.
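
A minimal sketch of the addRunner behaviour described above. The signature,
the `accelerator` constant, and the string values are assumptions made for
illustration; in the layout described, each "accelerator_foo.go" file would
sit behind its own build tag and supply the value.

```go
package main

import "fmt"

// accelerator stands in for the value a build-tagged file would define.
// With no tags set, the "none" variant applies (assumed name).
const accelerator = "none"

// addRunner is a hypothetical version of the helper: it appends the chosen
// accelerator to the runner list, but ignores the "none" placeholder.
func addRunner(runners []string, acc string) []string {
	if acc == "none" {
		return runners
	}
	return append(runners, acc)
}

func main() {
	runners := addRunner([]string{"cpu"}, accelerator)
	fmt.Println(runners)
}
```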
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf in the case automatic pull fails
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
* restore model load duration on generate response
- set model load duration on generate and chat done response
- calculate createAt time when response created
* remove checkpoints predict opts
* Update routes.go