Commit graph

3527 commits

Each entry lists: Author, SHA1, Message, Date
Viz
491fc312ae
readme: add PyOllaMx project () 2024-09-03 23:10:53 -04:00
Jeffrey Morgan
5e2653f9fe
llm: update llama.cpp commit to 8962422 () 2024-09-03 21:12:39 -04:00
Daniel Hiltgen
f29b167e1a
Use cuda v11 for driver 525 and older ()
Driver 525 (i.e., CUDA driver 12.0) appears to have problems with the CUDA v12 library
we compile against, so fall back to v11 when those older drivers are detected.
2024-09-03 17:15:31 -07:00
Daniel Hiltgen
037a4d103e
Log system memory at info ()
On systems with low system memory, we can hit allocation failures that are difficult to diagnose
without debug logs. Logging system memory at the info level makes such failures easier to spot.
2024-09-03 14:55:20 -07:00
Mateusz Migas
50c05d57e0
readme: add Painting Droid community integration () 2024-09-03 16:15:54 -04:00
Amith Koujalgi
35159de18a
readme: update Ollama4j link and add link to Ollama4j Web UI () 2024-09-03 16:08:50 -04:00
FellowTraveler
94fff5805f
Fix sprintf to snprintf ()
/Users/au/src/ollama/llm/ext_server/server.cpp:289:9: warning: 'sprintf' is deprecated: This function is provided for compatibility reasons only. Due to security concerns inherent in the design of sprintf(3), it is highly recommended that you use snprintf(3) instead.
2024-09-03 09:32:59 -07:00
OpenVMP
14d5093cd0
readme: add PartCAD tool to readme for generating 3D CAD models using Ollama () 2024-09-03 12:28:01 -04:00
R0CKSTAR
9df5f0e8e4
Reduce docker image size ()
Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>
2024-09-03 09:25:31 -07:00
presbrey
ad3eb00bee
readme: add OllamaFarm project () 2024-09-02 16:05:36 -04:00
Jonathan Hecl
bfc2d61549
readme: add go-crew and Ollamaclient projects () 2024-09-02 15:34:26 -04:00
SnoopyTlion
741affdfd6
docs: update faq.md for OLLAMA_MODELS env var permissions () 2024-09-02 15:31:29 -04:00
Vimal Kumar
5f7b4a5e30
fix(cmd): show info may have nil ModelInfo () 2024-08-31 21:12:17 -07:00
rayfiyo
1aad838707
docs: update GGUF examples and references () 2024-08-31 19:34:25 -07:00
Daniel Hiltgen
a1cef4d0a5
Add findutils to base images ()
Without findutils, internal files were missing from the images.
2024-08-31 10:40:05 -07:00
Michael Yang
c41f0b9e6c
Merge pull request from ollama/mxyng/build-artifacts
remove any unneeded build artifacts
2024-08-30 09:40:50 -07:00
Michael Yang
142cbb722d
Merge pull request from ollama/mxyng/client-path
passthrough OLLAMA_HOST path to client
2024-08-30 09:40:34 -07:00
Michael Yang
9468c6824a
Merge pull request from ollama/mxyng/messages
update templates to use messages
2024-08-30 09:39:59 -07:00
Michael Yang
11018196e0 remove any unneeded build artifacts 2024-08-29 13:40:47 -07:00
Bryan Honof
56346ccfa3
doc: Add Nix and Flox to package manager listing () 2024-08-29 12:45:35 -04:00
Patrick Devine
8e4e509fa4
update the openai docs to explain how to set the context size () 2024-08-28 17:11:46 -07:00
Michael Yang
47c2b947a9
Merge pull request from ollama/mxyng/fix-test
fix(test): do not clobber models directory
2024-08-28 15:37:47 -07:00
Michael Yang
5eb77bf976
Merge pull request from ollama/mxyng/validate-modelpath
fix: validate modelpath
2024-08-28 14:38:27 -07:00
Michael Yang
e4d0a9c325 fix(test): do not clobber models directory 2024-08-28 14:07:48 -07:00
Patrick Devine
7416ced70f
add llama3.1 chat template () 2024-08-28 14:03:20 -07:00
Michael Yang
9cfd2dd3e3
Merge pull request from ollama/mxyng/detect-chat
detect chat template from configs that contain lists
2024-08-28 11:04:18 -07:00
Michael Yang
8e6da3cbc5 update deprecated warnings 2024-08-28 09:55:11 -07:00
Michael Yang
d9d50c43cc validate model path 2024-08-28 09:32:57 -07:00
Patrick Devine
6c1c1ad6a9
throw an error when encountering unsupported tensor sizes () 2024-08-27 17:54:04 -07:00
Daniel Hiltgen
93ea9240ae
Move ollama executable out of bin dir () 2024-08-27 16:19:00 -07:00
Michael Yang
413ae39f3c update templates to use messages 2024-08-27 15:44:04 -07:00
Michael Yang
60e47573a6 more tokenizer tests 2024-08-27 14:51:10 -07:00
Patrick Devine
d13c3daa0b
add safetensors to the modelfile docs () 2024-08-27 14:46:47 -07:00
Patrick Devine
1713eddcd0
Fix import image width () 2024-08-27 14:19:47 -07:00
Daniel Hiltgen
4e1c4f6e0b
Update manual instructions with discrete ROCm bundle () 2024-08-27 13:42:28 -07:00
Sean Khatiri
397cae7962
llm: fix typo in comment () 2024-08-27 13:28:29 -07:00
Patrick Devine
1c70a00f71 adjust image sizes 2024-08-27 11:15:25 -07:00
Michael Yang
eae3af6807 clean up convert tokenizer 2024-08-27 11:11:43 -07:00
Michael Yang
3eb08377f8 detect chat template from configs that contain lists 2024-08-27 10:49:33 -07:00
Patrick Devine
ac80010db8
update the import docs () 2024-08-26 19:57:26 -07:00
Jeffrey Morgan
47fa0839b9
server: clean up route names for consistency () 2024-08-26 19:36:11 -07:00
Daniel Hiltgen
0f92b19bec
Only enable numa on CPUs ()
The numa flag may have a performance impact on multi-socket systems with GPU loads.
2024-08-24 17:24:50 -07:00
Daniel Hiltgen
69be940bf6
gpu: Group GPU Library sets by variant ()
The recent CUDA variant changes uncovered a bug in ByLibrary,
which failed to group GPU types by their common variant.
2024-08-23 15:11:56 -07:00
Michael Yang
9638c24c58
Merge pull request from ollama/mxyng/faq
update faq
2024-08-23 14:05:59 -07:00
Michael Yang
bb362caf88 update faq 2024-08-23 13:37:21 -07:00
Michael Yang
386af6c1a0 passthrough OLLAMA_HOST path to client 2024-08-23 13:23:28 -07:00
Patrick Devine
0c819e167b
convert safetensor adapters into GGUF () 2024-08-23 11:29:56 -07:00
Daniel Hiltgen
7a1e1c1caf
gpu: Ensure driver version set before variant ()
During rebasing, the ordering was inverted, breaking the CUDA version
selection logic: the driver version was evaluated as zero, incorrectly
causing a downgrade to v11.
2024-08-23 11:21:12 -07:00
Daniel Hiltgen
0b03b9c32f
llm: Align cmake define for cuda no peer copy ()
The define changed recently, and this one slipped through the cracks under the old
name.
2024-08-23 11:20:39 -07:00
Daniel Hiltgen
90ca84172c
Fix embeddings memory corruption ()
* Fix embeddings memory corruption

The patch was leading to buffer overrun corruption. Once it was removed, though, parallelism
in server.cpp led to hitting an assert because slot/seq IDs were >= the token count. To
work around this, only slot 0 is used for embeddings.

* Fix embed integration test assumption

The token eval count has changed with recent llama.cpp bumps (0.3.5+)
2024-08-22 14:51:42 -07:00