Commit graph

2113 commits

Jeffrey Morgan
d4ebdadbe7 enable cache_prompt by default 2023-12-27 14:23:42 -05:00
Daniel Hiltgen
e201efa14b Add windows native build instructions 2023-12-25 08:31:34 -08:00
Icelain
c5f21f73a4
follow best practices by adding resp.Body.Close() (#1708) 2023-12-25 09:01:37 -05:00
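
The practice behind this fix is worth spelling out: in Go, an `*http.Response` body must be closed once the error has been checked, otherwise the underlying connection is leaked and cannot be reused. A minimal illustration of the pattern, not the actual #1708 diff (the request target is only a placeholder):

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	resp, err := http.Get("http://localhost:11434/api/tags") // placeholder URL
	if err != nil {
		log.Fatal(err)
	}
	// Close the body after the error check; skipping this leaks the
	// underlying connection and prevents it from being reused.
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body))
}
```
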
Jeffrey Morgan
371bc73531
Update README.md 2023-12-24 11:54:08 -05:00
Jeffrey Morgan
c651d8b824
Update README.md 2023-12-23 11:18:12 -05:00
Daniel Hiltgen
cf50ef5b51
Merge pull request #1684 from dhiltgen/tag_integration_tests
Guard integration tests with a tag
2023-12-22 16:43:41 -08:00
Daniel Hiltgen
697bea6939 Guard integration tests with a tag
This should help CI avoid running the integration test logic in a
container where it's not currently possible.
2023-12-22 16:33:27 -08:00
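
In Go, that guard is typically a build constraint on the integration test files, so a plain `go test` compiles them out and CI opts in explicitly with `-tags`. A minimal sketch, assuming the tag is literally named `integration` and a hypothetical `server` package (the real tag and package names may differ):

```go
//go:build integration

// This file is only compiled when the "integration" tag is supplied,
// e.g. `go test -tags integration ./...`; a plain `go test` skips it.
package server

import "testing"

func TestGenerateEndToEnd(t *testing.T) {
	// Exercise a running server here; omitted in this sketch.
}
```
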
K0IN
10da41d677
Add Cache flag to api (#1642) 2023-12-22 17:16:20 -05:00
Bruce MacDonald
db356c8519
post-response templating (#1427) 2023-12-22 17:07:05 -05:00
Jeffrey Morgan
b80081022f cache docker builds in build_linux.sh 2023-12-22 16:01:20 -05:00
Matt Williams
790457398a
Merge pull request #1677 from jmorganca/mattw/docrunupdate
update the 'where are models stored' question
2023-12-22 09:56:27 -08:00
Matt Williams
511069a2a5 update the 'where are models stored' question
Signed-off-by: Matt Williams <m@technovangelist.com>
2023-12-22 09:48:44 -08:00
Matt Williams
5a85070c22
Update readmes, requirements, packagejsons, etc for all examples (#1452)
Most of the examples needed README updates to show how to run them. Some of the requirements.txt files had extra content that wasn't needed, or were missing altogether. Apparently some folks like to run `npm start`
to run TypeScript, so a start script was added to all TypeScript examples,
which hadn't been done before.

Basically just a lot of cleanup.

Signed-off-by: Matt Williams <m@technovangelist.com>
2023-12-22 09:10:41 -08:00
Matt Williams
291700c92d
Clean up documentation (#1506)
* Clean up documentation

Will probably need to update with PRs for new release.

Signed-off-by: Matt Williams <m@technovangelist.com>

* Correcting to fit in 0.1.15 changes

Signed-off-by: Matt Williams <m@technovangelist.com>

* Update README.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* addressing comments

Signed-off-by: Matt Williams <m@technovangelist.com>

* more api cleanup

Signed-off-by: Matt Williams <m@technovangelist.com>

* it's llava, not llama

Signed-off-by: Matt Williams <m@technovangelist.com>

* Update docs/troubleshooting.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Updated hosting to server and documented all env vars

Signed-off-by: Matt Williams <m@technovangelist.com>

* remove last of the cli descriptions

Signed-off-by: Matt Williams <m@technovangelist.com>

* Update README.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* update further per conversation with jeff earlier today

Signed-off-by: Matt Williams <m@technovangelist.com>

* cleanup the doc readme

Signed-off-by: Matt Williams <m@technovangelist.com>

* move upgrade to faq

Signed-off-by: Matt Williams <m@technovangelist.com>

* first change

Signed-off-by: Matt Williams <m@technovangelist.com>

* updated

Signed-off-by: Matt Williams <m@technovangelist.com>

* Update docs/faq.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/api.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/api.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/api.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/api.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/api.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/api.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/README.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/api.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/api.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/api.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update README.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/README.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/api.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/api.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/api.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/README.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/README.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/README.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* examples in parent

Signed-off-by: Matt Williams <m@technovangelist.com>

* add example for create model.

Signed-off-by: Matt Williams <m@technovangelist.com>

* update faq

Signed-off-by: Matt Williams <m@technovangelist.com>

* update create model api

Signed-off-by: Matt Williams <m@technovangelist.com>

* Update docs/api.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/faq.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/troubleshooting.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* update the readme in docs

Signed-off-by: Matt Williams <m@technovangelist.com>

* update a few more things

Signed-off-by: Matt Williams <m@technovangelist.com>

* Update docs/troubleshooting.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/faq.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update README.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/modelfile.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update docs/troubleshooting.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

---------

Signed-off-by: Matt Williams <m@technovangelist.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2023-12-22 09:10:01 -08:00
Daniel Hiltgen
9db28af84e
Merge pull request #1675 from dhiltgen/less_verbose
Quiet down llama.cpp logging by default
2023-12-22 08:57:17 -08:00
Daniel Hiltgen
e5202eb687 Quiet down llama.cpp logging by default
By default, builds will now produce non-debug, non-verbose binaries.
To enable verbose logs in llama.cpp and debug symbols in the
native code, set `CGO_CFLAGS=-g`.
2023-12-22 08:47:18 -08:00
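
The mechanism behind that knob: the go tool appends the `CGO_CFLAGS` environment variable to whatever the `#cgo` directives declare, so an environment override layers on top of the quiet release defaults. A minimal sketch of such a cgo block, with illustrative flags and package name rather than the actual ollama build settings:

```go
package llm

/*
// Quiet release defaults (illustrative). Flags from the CGO_CFLAGS environment
// variable are appended after these at build time, so `CGO_CFLAGS=-g` adds
// debug symbols to the native code without editing the source.
#cgo CFLAGS: -O3 -DNDEBUG
#include <stdlib.h>
*/
import "C"
```
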
Daniel Hiltgen
96fb441abd
Merge pull request #1146 from dhiltgen/ext_server_cgo
Add cgo implementation for llama.cpp
2023-12-22 08:16:31 -08:00
Daniel Hiltgen
495c06e4a6 Fix doc glitch 2023-12-21 18:21:31 -08:00
Daniel Hiltgen
fa24e73b82 Remove CPU build, fixup linux build script 2023-12-21 18:21:31 -08:00
Daniel Hiltgen
325d74985b Fix CPU performance on hyperthreaded systems
The default thread count logic was broken and resulted in twice as many
threads as it should on a hyperthreaded CPU,
which caused thrashing and poor performance.
2023-12-21 16:23:36 -08:00
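
The message suggests the sensible default is one thread per physical core rather than one per logical (hyperthreaded) CPU. A sketch of that idea using the third-party gopsutil library to count physical cores; this is an assumption about the approach, not necessarily how the fix is implemented:

```go
package main

import (
	"fmt"
	"runtime"

	"github.com/shirou/gopsutil/v3/cpu"
)

// defaultThreads returns one worker per physical core, falling back to the
// logical CPU count if physical-core detection fails.
func defaultThreads() int {
	if physical, err := cpu.Counts(false); err == nil && physical > 0 {
		return physical
	}
	return runtime.NumCPU()
}

func main() {
	fmt.Println("default thread count:", defaultThreads())
}
```
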
Bruce MacDonald
fabf2f3467
allow for starting llava queries with filepath (#1549) 2023-12-21 13:20:59 -05:00
Daniel Hiltgen
d9cd3d9667 Revive windows build
The windows native setup still needs some more work, but this gets it building
again, and if you set the PATH properly, you can run the resulting exe on a CUDA system.
2023-12-20 17:21:54 -08:00
Patrick Devine
a607d922f0
add FAQ for slow networking in WSL2 (#1646) 2023-12-20 16:27:24 -08:00
Daniel Hiltgen
7555ea44f8 Revamp the dynamic library shim
This switches the default llama.cpp build to be CPU based, and builds the GPU variants
as dynamically loaded libraries that we can select at runtime.

This also bumps the ROCm library to version 6, since 5.7 builds don't work
with the latest ROCm library that just shipped.
2023-12-20 14:45:57 -08:00
Jeffrey Morgan
df06812494
Update api.md 2023-12-20 08:47:53 -05:00
Daniel Hiltgen
1d1eb1688c Additional nvidia-ml path to check 2023-12-19 15:52:34 -08:00
Michael Yang
23dc179350
Merge pull request #1619 from jmorganca/mxyng/fix-version-test
fix(test): use real version string for comparison
2023-12-19 15:48:52 -08:00
Michael Yang
63aac0edc5 fix(test): use real version string for comparison 2023-12-19 15:03:02 -08:00
Daniel Hiltgen
6558f94ed0 Fix darwin intel build 2023-12-19 13:32:24 -08:00
Erick Ghaumez
1ca484f67e
Add Langchain Dart library (#1564)
* Add Langchain Dart

* Update README.md

---------

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-12-19 14:04:52 -05:00
Jeffrey Morgan
72b0c32fe9
Update README.md 2023-12-19 12:59:22 -05:00
Jeffrey Morgan
68c28224f8
Update README.md 2023-12-19 12:59:03 -05:00
Daniel Hiltgen
54dbfa4c4a Carry ggml-metal.metal as payload 2023-12-19 09:05:46 -08:00
Daniel Hiltgen
5646826a79 Add WSL2 path to nvidia-ml.so library 2023-12-19 09:05:46 -08:00
Daniel Hiltgen
3269535a4c Refine handling of shim presence
This allows the CPU-only builds to work on systems with Radeon cards
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
1b991d0ba9 Refine build to support CPU only
If someone checks out the ollama repo and doesn't install the CUDA
library, this will ensure they can build a CPU-only version.
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
51082535e1 Add automated test for multimodal
A simple test case that verifies llava:7b can read text in an image
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
9adca7f711 Bump llama.cpp to b1662 and set n_parallel=1 2023-12-19 09:05:46 -08:00
Daniel Hiltgen
89bbaafa64 Build linux using ubuntu 20.04
This changes the container-based linux build to use an older Ubuntu
distro to improve our compatibility matrix for older user machines
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
35934b2e05 Adapted rocm support to cgo based llama.cpp 2023-12-19 09:05:46 -08:00
65a
f8ef4439e9 Use build tags to generate accelerated binaries for CUDA and ROCm on Linux.
The build tags `rocm` or `cuda` must be specified to both `go generate` and `go build`.
ROCm builds need ROCM_PATH set (and the ROCm SDK present) as well as CLBlast installed
(for GGML) and CLBlast_DIR set in the environment to the CLBlast cmake directory
(likely /usr/lib/cmake/CLBlast). Build tags are also used to switch VRAM detection
between the cuda and rocm implementations, via added "accelerator_foo.go" files which
contain architecture-specific functions and variables. accelerator_none is used when
no tags are set, and a helper function addRunner will ignore it if it is the chosen
accelerator. Fix go generate commands; thanks @deadmeu for testing.
2023-12-19 09:05:46 -08:00
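
The per-accelerator file layout described above maps directly onto Go build constraints. A sketch of what the tag-free fallback file could look like (file name, package, and function are illustrative, following the "accelerator_foo.go" naming from the message rather than the real source):

```go
//go:build !cuda && !rocm

// accelerator_none.go (illustrative): compiled only when neither the "cuda"
// nor the "rocm" build tag is set. A matching accelerator_cuda.go would carry
// `//go:build cuda` and implement the same function against the CUDA libraries.
package llm

import "errors"

// errNoAccelerator signals that VRAM detection is unavailable in CPU-only builds.
var errNoAccelerator = errors.New("no accelerator support in this binary")

// acceleratedVRAM reports available accelerator memory; the tag-free variant
// always reports that no accelerator is present.
func acceleratedVRAM() (uint64, error) {
	return 0, errNoAccelerator
}
```

Building with `go generate -tags cuda ./...` and `go build -tags cuda` (or `rocm`) then swaps in the accelerated file, as the message describes.
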
Daniel Hiltgen
d4cd695759 Add cgo implementation for llama.cpp
Run server.cpp directly inside the Go runtime via cgo,
while retaining the LLM Go abstractions.
2023-12-19 09:05:46 -08:00
Bruce MacDonald
5e7fd6906f Update images.go 2023-12-19 09:05:46 -08:00
Bruce MacDonald
811b1f03c8 deprecate ggml
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf in case the automatic pull fails

Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-12-19 09:05:46 -08:00
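
The detection side of this boils down to the model file's magic bytes: GGUF files begin with the four ASCII bytes "GGUF", so anything else can be treated as a legacy format that needs re-pulling. A minimal sketch of that check; the function name and placeholder path are illustrative, and the automatic-pull fallback is omitted:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"os"
)

// isGGUF reports whether the file at path starts with the GGUF magic bytes.
func isGGUF(path string) (bool, error) {
	f, err := os.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()

	magic := make([]byte, 4)
	if _, err := io.ReadFull(f, magic); err != nil {
		return false, err
	}
	return bytes.Equal(magic, []byte("GGUF")), nil
}

func main() {
	ok, err := isGGUF("model.bin") // placeholder path
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("gguf:", ok)
}
```
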
Matt Williams
ed195f3562
Merge pull request #1595 from pgibler/main
Added cmdh to community section in README
2023-12-18 20:55:18 -08:00
Matt Williams
e0d0072ef1
Merge pull request #1592 from jmorganca/mattw/examplepruning
Let's get rid of these old modelfile examples
2023-12-18 20:29:48 -08:00
pgibler
620a2ffcfb Added cmdh to community section in README 2023-12-18 22:04:40 -05:00
Matt Williams
d287013f24 Let's get rid of these old modelfile examples
Signed-off-by: Matt Williams <m@technovangelist.com>
2023-12-18 17:47:33 -08:00
Jeffrey Morgan
6b5bdfa6c9 update runner submodule 2023-12-18 17:33:46 -05:00
Jeffrey Morgan
c063ee4af0 update runner submodule to fix hipblas build 2023-12-18 15:41:13 -05:00