ollama

Author	SHA1	Message	Date
湛露先生	eaaf5d309d	cmd: delete duplicated call to sb.Reset() (#7308 ) Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>	2024-11-21 11:20:48 -08:00
Jeffrey Morgan	27d9c749d5	docs: remove tutorials, add cloud section to community integrations (#7784 )	2024-11-21 09:59:53 -08:00
R0CKSTAR	b7bddeebc1	env.sh: cleanup unused RELEASE_IMAGE_REPO (#6855 ) Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2024-11-21 08:28:04 -08:00
Paul Robello	6a0c2ec50f	readme: add terminal tool ParLlama to community integrations (#5623 )	2024-11-21 02:55:35 -08:00
毛巳煜	baa41be2aa	readme: add a community made ollama web management tool (#7126 )	2024-11-21 02:51:45 -08:00
xuyangbocn	2157b1232e	readme: add Terraform AWS Ollama & Open WebUI community example (#5633 )	2024-11-21 02:28:57 -08:00
emrgnt-cmplxty	37711578a2	readme: add R2R to community integrations (#5587 )	2024-11-21 02:09:36 -08:00
Cyril Blaecke	fb2c9594e0	readme: Add Nosia to Community Integrations (#5381 )	2024-11-21 02:07:17 -08:00
Christian Tzolov	7fbcd55da3	readme: Add Spring AI library reference (#5981 )	2024-11-21 02:02:14 -08:00
Philippe Charrière	b4348bdd25	readme: add Parakeet to community integrations Parakeet is a GoLang SDK for Ollama --------- Co-authored-by: Parth Sareen <parth.sareen@ollama.com>	2024-11-21 02:00:32 -08:00
Marcin Szczygliński	155734e09a	readme: add community integration py-gpt (#6503 )	2024-11-21 01:54:39 -08:00
Michael	883d80e097	readme: add Promptery to community integrations (#7093 )	2024-11-21 01:46:20 -08:00
Jakub Burkiewicz	e4c9f75b23	readme: add node-red-contrib-ollama to community integrations (#4648 )	2024-11-21 01:09:37 -08:00
Dezoito	f5ec7cc872	readme: add ollama grid search, a community project (#4301 )	2024-11-21 01:02:46 -08:00
Franco Lombardo	811bafba82	readme: Add LLPhant to community integrations (#5679 )	2024-11-21 00:54:26 -08:00
Aarushi	431075fcbb	readme: add autogpt integration to list of community integrations (#6459 )	2024-11-21 00:51:38 -08:00
Kevin Brake	c4f27225ac	readme: add community contribution to readme ollama-kis (#5575 )	2024-11-21 00:31:27 -08:00
chyok	b7aa5ee06c	readme: Add tkinter-based client to community based integrations (#5412 )	2024-11-21 00:19:24 -08:00
Nico	3f87f71755	readme: add Shinkai Desktop to community integrations (#4877 )	2024-11-21 00:16:18 -08:00
Laurent Eschenauer	20623cec13	readme: add OpenGPA to community integrations (#5497 )	2024-11-21 00:13:54 -08:00
Andy Gill	0e5f31a86d	readme: add Haverscript to community integrations (#6945 ) Haverscript uses classical functional programming techniques to provide a composable interface for interacting with ollama-hosted LLMs.	2024-11-21 00:11:39 -08:00
drunkwcodes	7e92091751	readme: Terminal app bb7 to community integrations (#7064 )	2024-11-21 00:03:11 -08:00
boessu	1a742f54c9	readme: update AMD ROCm links (#7213 )	2024-11-20 23:48:55 -08:00
奶茶叔叔	6a89dcf848	readme: flutter-based chat app to community integrations (#7221 )	2024-11-20 23:30:10 -08:00
Alexander F. Rødseth	c5e238e8e5	readme: orbiton to community integrations (#7770 )	2024-11-20 23:24:05 -08:00
Nikita Ganzikov	fce30f407a	app: typo in wintray messages const (#7705 )	2024-11-20 22:01:58 -08:00
Daniel Hiltgen	d863298210	docs: Link to AMD guide on multi-GPU guidance (#7744 )	2024-11-20 16:00:46 -08:00
Jesse Gross	c4b34f2a2a	runner.go: Truncate inputs that exceed context rather than shifting Previous versions of the runner would truncate inputs to the context window before beginning processing. The main processing loop relied on this behavior if the context needed to be shifted later (due to token generation). If truncation did not occur then invariants would be broken, causing crashes or infinite loops. Later versions attempted to fix these bugs and make the logic less subtle so that all inputs could be handled. Truncation was removed to make things consistent. However, truncation is much faster than processing and shifting, so removing it caused performance problems when the input vastly exceeded the context size. This restores the input truncation as a performance optimization while keeping the more robust processing logic. Fixes #7762	2024-11-20 12:49:24 -08:00
Jesse Gross	c3ff916431	runner.go: Don't add inputs to cache view until actually processed We need to track which tokens are in the cache ourselves. We currently add tokens to the cache tracker when we add them to batch but they are not actually in the cache until we call Decode. This can cause confusion when we are shifting the cache. Avoids "could not find a KV slot for the batch" issues. Bug #7545	2024-11-20 12:49:24 -08:00
Jesse Gross	3fc1dc0e6f	runner.go: Hard fail on errors rather than potentially infinite looping We try to recover from errors by dropping the tokens that caused the problem and re-trying. However, dropping the tokens is not correct and continuing often leads to infinite loops. To avoid this, we end the sequence if such a condition is detected, which is also surprising. At this point, it is better to just report the error. This will make it easier to find problems and the alternatives are perhaps even more surprising to users. This is not a very satisfactory solution either - we should isolate the error and return it to the user without killing the whole process. However, this is an incremental step and consistent with most other failures (which either manifest as abort() or panic).	2024-11-20 12:49:24 -08:00
Jesse Gross	7121dfa309	runner.go: Retry decoding after defragmentation if needed Fragmentation of the KV cache can occur due to cache shifting or different sequences getting processed. Decode uses a heuristic to decide if it should defrag. However, this heuristic isn't 100% accurate, so decoding can sometimes fail by surprise. For these cases, if decode indicates that there is no KV cache space, we should defrag and then try again.	2024-11-20 12:49:24 -08:00
Jesse Gross	5f68fcab12	runner.go: Use correct index when retrieving embedding results This doesn't have any impact currently because NUM_PARALLEL is forced to 1 for embeddings, so both indices will always be 0.	2024-11-20 12:49:24 -08:00
Emir Sahin	ecf41eed05	readme: add llm-axe to community integrations (#5931 )	2024-11-20 10:53:14 -08:00
Marcus Ziadé	b8c66d3307	readme: add a swift community integration (#7383 )	2024-11-20 10:49:15 -08:00
thewh1teagle	303f4bc79e	readme: add vibe app to community integrations (#7607 )	2024-11-20 10:45:10 -08:00
Adarsh Mishra	d2a25206b1	readme: add opentalkgpt to community integrations (#7707 )	2024-11-20 10:42:55 -08:00
rohitanshu	2f0a8c8778	docs: fix minor typo in import.md (#7764 ) change 'containg' to 'containing'	2024-11-20 09:57:32 -08:00
Gordon Kamer	bfd30f4286	readme: add Abbey to community integrations (#7746 )	2024-11-19 21:37:15 -08:00
Jonathan Hecl	0ef17ede89	readme: add Gollama to community integrations (#7756 )	2024-11-19 21:31:43 -08:00
Daniel Hiltgen	909a88c5c0	Improve crash reporting (#7728 ) Many model crashes are masked behind "An existing connection was forcibly closed by the remote host" This captures that common error message and wires in any detected errors from the log. This also adds the deepseek context shift error to the known errors we capture.	2024-11-19 16:26:57 -08:00
Daniel Hiltgen	f602ab4de4	expose underlying error on embedding failure (#7743 ) Avoid a round-trip asking users for logs to see what went wrong.	2024-11-19 16:26:05 -08:00
Gabe Goodhart	807ace5b1f	fix(runner): Set logits to 0 if false on Batch.Add https://github.com/ollama/ollama/issues/7656 Branch: Granite3StoppingBug-7656 Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>	2024-11-19 15:45:37 -08:00
Blake Mizerany	4b8a2e341a	server: allow mixed-case model names on push, pull, cp, and create (#7676 ) This change allows for mixed-case model names to be pushed, pulled, copied, and created, which was previously disallowed because the Ollama registry was backed by a Docker registry that enforced a naming convention that disallowed mixed-case names, which is no longer the case. This does not break existing, intended, behaviors. Also, make TestCase test a story of creating, updating, pulling, and copying a model with case variations, ensuring the model's manifest is updated correctly, and not duplicated across different files with different case variations.	2024-11-19 15:05:57 -08:00
frob	e66c29261a	Better error suppression when getting terminal colours (#7739 ) Co-authored-by: Richard Lyons <frob@cloudstaff.com>	2024-11-19 08:33:52 -08:00
Patrick Devine	712d63c3f0	update the docs (#7731 )	2024-11-18 21:17:38 -08:00
Patrick Sy	6cdf27d154	readme: add Alfred Ollama to community integrations (#7724 )	2024-11-18 19:33:23 -08:00
frob	5c18e66384	Notify the user if systemd is not running (#6693 ) Co-authored-by: Richard Lyons <frob@cloudstaff.com>	2024-11-18 15:02:41 -08:00
Daniel Hiltgen	35096a7eff	win: add right click menu support (#7727 ) Enable both left and right click on the pop-up menu	2024-11-18 14:39:52 -08:00
Daniel Hiltgen	81d55d3e4d	fix index out of range on zero layer metal load (#7696 ) If the model doesn't fit any layers on metal, and we load zero layers we would panic trying to look up the GPU size during scheduling ops	2024-11-18 11:48:13 -08:00
Vinh Nguyen	a14f76491d	readme: improve Community Integrations section (#7718 )	2024-11-17 19:30:22 -08:00

3662 commits