ollama

Author	SHA1	Message	Date
Daniel Hiltgen	35934b2e05	Adapted rocm support to cgo based llama.cpp	2023-12-19 09:05:46 -08:00
65a	f8ef4439e9	Use build tags to generate accelerated binaries for CUDA and ROCm on Linux. The build tags rocm or cuda must be specified to both go generate and go build. ROCm builds should have both ROCM_PATH set (and the ROCM SDK present) as well as CLBlast installed (for GGML) and CLBlast_DIR set in the environment to the CLBlast cmake directory (likely /usr/lib/cmake/CLBlast). Build tags are also used to switch VRAM detection between cuda and rocm implementations, using added "accelerator_foo.go" files which contain architecture specific functions and variables. accelerator_none is used when no tags are set, and a helper function addRunner will ignore it if it is the chosen accelerator. Fix go generate commands, thanks @deadmeu for testing.	2023-12-19 09:05:46 -08:00
Daniel Hiltgen	d4cd695759	Add cgo implementation for llama.cpp Run the server.cpp directly inside the Go runtime via cgo while retaining the LLM Go abstractions.	2023-12-19 09:05:46 -08:00
Bruce MacDonald	5e7fd6906f	Update images.go	2023-12-19 09:05:46 -08:00
Bruce MacDonald	811b1f03c8	deprecate ggml - remove ggml runner - automatically pull gguf models when ggml detected - tell users to update to gguf in the case automatic pull fails Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>	2023-12-19 09:05:46 -08:00
Matt Williams	ed195f3562	Merge pull request #1595 from pgibler/main Added cmdh to community section in README	2023-12-18 20:55:18 -08:00
Matt Williams	e0d0072ef1	Merge pull request #1592 from jmorganca/mattw/examplepruning Lets get rid of these old modelfile examples	2023-12-18 20:29:48 -08:00
pgibler	620a2ffcfb	Added cmdh to community section in README	2023-12-18 22:04:40 -05:00
Matt Williams	d287013f24	Lets get rid of these old modelfile examples Signed-off-by: Matt Williams <m@technovangelist.com>	2023-12-18 17:47:33 -08:00
Jeffrey Morgan	6b5bdfa6c9	update runner submodule	2023-12-18 17:33:46 -05:00
Jeffrey Morgan	c063ee4af0	update runner submodule to fix hipblas build	2023-12-18 15:41:13 -05:00
Bruce MacDonald	d99fa6ce0a	send empty messages on last chat response (#1530 )	2023-12-18 14:23:38 -05:00
Patrick Devine	3948c6ea06	add magic header for unit tests (#1558 )	2023-12-18 10:41:02 -08:00
Jeffrey Morgan	b85982eb91	update runner submodule	2023-12-18 12:43:31 -05:00
Patrick Devine	86b0dd4b16	add API create/copy handlers (#1541 )	2023-12-15 11:59:18 -08:00
Augustinas Malinauskas	f728738427	README with Enchanted iOS App (#1529 ) * feat(docs): README with Enchanted iOS app * Update README.md --------- Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2023-12-15 14:37:29 -05:00
Ian Purton	115048a0d8	Added Bionic GPT as a front end. (#1463 ) * Added Bionic GPT as a front end. * Update README.md --------- Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2023-12-15 14:33:04 -05:00
Bruce MacDonald	1b417a7836	use exp slices for go 1.20 compatibility (#1544 )	2023-12-15 14:15:56 -05:00
Patrick Devine	0174665d0e	add API tests for list handler (#1535 )	2023-12-14 18:18:25 -08:00
Patrick Devine	630518f0d9	Add unit test of API routes (#1528 )	2023-12-14 16:47:40 -08:00
Bruce MacDonald	6e16098a60	remove sample_count from docs (#1527 ) this info has not been returned from these endpoints in some time	2023-12-14 17:49:00 -05:00
Bruce MacDonald	6ee8c80199	restore model load duration on generate response (#1524 ) * restore model load duration on generate response - set model load duration on generate and chat done response - calculate createAt time when response created * remove checkpoints predict opts * Update routes.go	2023-12-14 12:15:50 -05:00
Jeffrey Morgan	31f0551dab	Update runner to support mixtral and mixture of experts (MoE) (#1475 )	2023-12-13 17:15:10 -05:00
Jeffrey Morgan	4a1abfe4fa	fix tests	2023-12-13 14:42:30 -05:00
Jeffrey Morgan	bbd41494bf	add multimodal to `README.md`	2023-12-13 14:38:47 -05:00
Jeffrey Morgan	fedba24a63	Docs for multimodal support (#1485 ) * add multimodal docs * add chat api docs * consistency between `/api/generate` and `/api/chat` * simplify docs	2023-12-13 13:59:33 -05:00
pepperoni21	e3b090dbc5	Added message format for chat api (#1488 )	2023-12-13 11:21:23 -05:00
Patrick Devine	d9e60f634b	add image support to the chat api (#1490 )	2023-12-12 13:28:58 -08:00
Michael Yang	4251b342de	Merge pull request #1469 from jmorganca/mxyng/model-types remove per-model types	2023-12-12 12:27:03 -08:00
Jeffrey Morgan	0a9d348023	Fix issues with `/set template` and `/set system` (#1486 )	2023-12-12 14:43:19 -05:00
Bruce MacDonald	3144e2a439	exponential back-off (#1484 )	2023-12-12 12:33:02 -05:00
Bruce MacDonald	c0960e29b5	retry on concurrent request failure (#1483 ) - remove parallel	2023-12-12 12:14:35 -05:00
ruecat	5314fc9b63	Fix Readme "Database -> MindsDB" link (#1479 )	2023-12-12 10:26:13 -05:00
Jorge Torres	a36b5fef3b	Update README.md (#1412 )	2023-12-11 18:05:10 -05:00
Patrick Devine	910e9401d0	Multimodal support (#1216 ) --------- Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>	2023-12-11 13:56:22 -08:00
Michael Yang	56ffc3023a	remove per-model types mostly replaced by decoding tensors except ggml models which only support llama	2023-12-11 09:40:21 -08:00
Bruce MacDonald	7a1b37ac64	os specific ctrl-z (#1420 )	2023-12-11 10:48:14 -05:00
Jeffrey Morgan	5d4d2e2c60	update docs with chat completion api	2023-12-10 13:53:36 -05:00
Jeffrey Morgan	7db5bcf73b	fix `go-staticcheck` warning	2023-12-10 11:44:27 -05:00
Jeffrey Morgan	fa2f095bd9	fix model name returned by `/api/generate` being different than the model name provided	2023-12-10 11:42:15 -05:00
Jeffrey Morgan	045b855db9	fix error on accumulating final chat response	2023-12-10 11:24:39 -05:00
Jeffrey Morgan	32064a0646	fix empty response when receiving runner error	2023-12-10 10:53:38 -05:00
Jeffrey Morgan	d9a250e9b5	seek to end of file when decoding older model formats	2023-12-09 21:14:35 -05:00
Jeffrey Morgan	944519ed16	seek to eof for older model binaries	2023-12-09 20:48:57 -05:00
Jeffrey Morgan	2dd040d04c	do not use `--parallel 2` for old runners	2023-12-09 20:17:33 -05:00
Bruce MacDonald	bbe41ce41a	fix: parallel queueing race condition caused silent failure (#1445 ) * fix: queued request failures - increase parallel requests to 2 to complete queued request, queueing is managed in ollama * log steam errors	2023-12-09 14:14:02 -05:00
Jeffrey Morgan	9e1406e4ed	Don't expose model information in `/api/generate`	2023-12-09 02:05:43 -08:00
Jeffrey Morgan	b74580c913	Update api.md	2023-12-08 16:02:07 -08:00
Bruce MacDonald	7e9405fd07	fix: encode full previous prompt in context (#1424 )	2023-12-08 16:53:51 -05:00
Bruce MacDonald	3b0b8930d4	fix: only flush template in chat when current role encountered (#1426 )	2023-12-08 16:44:24 -05:00

1 2 3 4 5 ...

1624 commits