Daniel Hiltgen
1b991d0ba9
Refine build to support CPU only
...
If someone checks out the ollama repo and doesn't install the CUDA
library, this will ensure they can build a CPU only version
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
51082535e1
Add automated test for multimodal
...
A simple test case that verifies llava:7b can read text in an image
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
9adca7f711
Bump llama.cpp to b1662 and set n_parallel=1
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
89bbaafa64
Build linux using ubuntu 20.04
...
This changes the container-based linux build to use an older Ubuntu
distro to improve our compatibility matrix for older user machines
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
35934b2e05
Adapted rocm support to cgo based llama.cpp
2023-12-19 09:05:46 -08:00
65a
f8ef4439e9
Use build tags to generate accelerated binaries for CUDA and ROCm on Linux.
...
The build tags rocm or cuda must be specified to both go generate and go build.
ROCm builds should have both ROCM_PATH set (and the ROCM SDK present) as well
as CLBlast installed (for GGML) and CLBlast_DIR set in the environment to the
CLBlast cmake directory (likely /usr/lib/cmake/CLBlast). Build tags are also
used to switch VRAM detection between cuda and rocm implementations, using
added "accelerator_foo.go" files which contain architecture specific functions
and variables. accelerator_none is used when no tags are set, and a helper
function addRunner will ignore it if it is the chosen accelerator. Fix go
generate commands, thanks @deadmeu for testing.
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
d4cd695759
Add cgo implementation for llama.cpp
...
Run the server.cpp directly inside the Go runtime via cgo
while retaining the LLM Go abstractions.
2023-12-19 09:05:46 -08:00
Bruce MacDonald
5e7fd6906f
Update images.go
2023-12-19 09:05:46 -08:00
Bruce MacDonald
811b1f03c8
deprecate ggml
...
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf in the case automatic pull fails
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-12-19 09:05:46 -08:00
Matt Williams
ed195f3562
Merge pull request #1595 from pgibler/main
...
Added cmdh to community section in README
2023-12-18 20:55:18 -08:00
Matt Williams
e0d0072ef1
Merge pull request #1592 from jmorganca/mattw/examplepruning
...
Lets get rid of these old modelfile examples
2023-12-18 20:29:48 -08:00
pgibler
620a2ffcfb
Added cmdh to community section in README
2023-12-18 22:04:40 -05:00
Matt Williams
d287013f24
Lets get rid of these old modelfile examples
...
Signed-off-by: Matt Williams <m@technovangelist.com>
2023-12-18 17:47:33 -08:00
Jeffrey Morgan
6b5bdfa6c9
update runner submodule
2023-12-18 17:33:46 -05:00
Jeffrey Morgan
c063ee4af0
update runner submodule to fix hipblas build
2023-12-18 15:41:13 -05:00
Bruce MacDonald
d99fa6ce0a
send empty messages on last chat response ( #1530 )
2023-12-18 14:23:38 -05:00
Patrick Devine
3948c6ea06
add magic header for unit tests ( #1558 )
2023-12-18 10:41:02 -08:00
Jeffrey Morgan
b85982eb91
update runner submodule
2023-12-18 12:43:31 -05:00
Patrick Devine
86b0dd4b16
add API create/copy handlers ( #1541 )
2023-12-15 11:59:18 -08:00
Augustinas Malinauskas
f728738427
README with Enchanted iOS App ( #1529 )
...
* feat(docs): README with Enchanted iOS app
* Update README.md
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-12-15 14:37:29 -05:00
Ian Purton
115048a0d8
Added Bionic GPT as a front end. ( #1463 )
...
* Added Bionic GPT as a front end.
* Update README.md
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-12-15 14:33:04 -05:00
Bruce MacDonald
1b417a7836
use exp slices for go 1.20 compatibility ( #1544 )
2023-12-15 14:15:56 -05:00
Patrick Devine
0174665d0e
add API tests for list handler ( #1535 )
2023-12-14 18:18:25 -08:00
Patrick Devine
630518f0d9
Add unit test of API routes ( #1528 )
2023-12-14 16:47:40 -08:00
Bruce MacDonald
6e16098a60
remove sample_count from docs ( #1527 )
...
this info has not been returned from these endpoints in some time
2023-12-14 17:49:00 -05:00
Bruce MacDonald
6ee8c80199
restore model load duration on generate response ( #1524 )
...
* restore model load duration on generate response
- set model load duration on generate and chat done response
- calculate createAt time when response created
* remove checkpoints predict opts
* Update routes.go
2023-12-14 12:15:50 -05:00
Jeffrey Morgan
31f0551dab
Update runner to support mixtral and mixture of experts (MoE) ( #1475 )
2023-12-13 17:15:10 -05:00
Jeffrey Morgan
4a1abfe4fa
fix tests
2023-12-13 14:42:30 -05:00
Jeffrey Morgan
bbd41494bf
add multimodal to README.md
2023-12-13 14:38:47 -05:00
Jeffrey Morgan
fedba24a63
Docs for multimodal support ( #1485 )
...
* add multimodal docs
* add chat api docs
* consistency between `/api/generate` and `/api/chat`
* simplify docs
2023-12-13 13:59:33 -05:00
pepperoni21
e3b090dbc5
Added message format for chat api ( #1488 )
2023-12-13 11:21:23 -05:00
Patrick Devine
d9e60f634b
add image support to the chat api ( #1490 )
2023-12-12 13:28:58 -08:00
Michael Yang
4251b342de
Merge pull request #1469 from jmorganca/mxyng/model-types
...
remove per-model types
2023-12-12 12:27:03 -08:00
Jeffrey Morgan
0a9d348023
Fix issues with /set template
and /set system
( #1486 )
2023-12-12 14:43:19 -05:00
Bruce MacDonald
3144e2a439
exponential back-off ( #1484 )
2023-12-12 12:33:02 -05:00
Bruce MacDonald
c0960e29b5
retry on concurrent request failure ( #1483 )
...
- remove parallel
2023-12-12 12:14:35 -05:00
ruecat
5314fc9b63
Fix Readme "Database -> MindsDB" link ( #1479 )
2023-12-12 10:26:13 -05:00
Jorge Torres
a36b5fef3b
Update README.md ( #1412 )
2023-12-11 18:05:10 -05:00
Patrick Devine
910e9401d0
Multimodal support ( #1216 )
...
---------
Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>
2023-12-11 13:56:22 -08:00
Michael Yang
56ffc3023a
remove per-model types
...
mostly replaced by decoding tensors except ggml models which only
support llama
2023-12-11 09:40:21 -08:00
Bruce MacDonald
7a1b37ac64
os specific ctrl-z ( #1420 )
2023-12-11 10:48:14 -05:00
Jeffrey Morgan
5d4d2e2c60
update docs with chat completion api
2023-12-10 13:53:36 -05:00
Jeffrey Morgan
7db5bcf73b
fix go-staticcheck
warning
2023-12-10 11:44:27 -05:00
Jeffrey Morgan
fa2f095bd9
fix model name returned by /api/generate
being different than the model name provided
2023-12-10 11:42:15 -05:00
Jeffrey Morgan
045b855db9
fix error on accumulating final chat response
2023-12-10 11:24:39 -05:00
Jeffrey Morgan
32064a0646
fix empty response when receiving runner error
2023-12-10 10:53:38 -05:00
Jeffrey Morgan
d9a250e9b5
seek to end of file when decoding older model formats
2023-12-09 21:14:35 -05:00
Jeffrey Morgan
944519ed16
seek to eof for older model binaries
2023-12-09 20:48:57 -05:00
Jeffrey Morgan
2dd040d04c
do not use --parallel 2
for old runners
2023-12-09 20:17:33 -05:00
Bruce MacDonald
bbe41ce41a
fix: parallel queueing race condition caused silent failure ( #1445 )
...
* fix: queued request failures
- increase parallel requests to 2 to complete queued request, queueing is managed in ollama
* log steam errors
2023-12-09 14:14:02 -05:00