Commit graph

2446 commits

Author SHA1 Message Date
Patrick Devine
ce8ce82567
add mixtral 8x7b model conversion (#3859) 2024-04-23 20:17:04 -07:00
Blake Mizerany
4dc4f1be34
types/model: restrict digest hash part to a minimum of 2 characters (#3858)
This allows users of a valid Digest to know it has a minimum of 2
characters in the hash part for use when sharding.

This is a reasonable restriction as the hash part is a SHA256 hash which
is 64 characters long, which is the common hash used. There is no
anticipation of using a hash with less than 2 characters.

Also, add MustParseDigest.

Also, replace Digest.Type with Digest.Split for getting both the type
and hash parts together, which is most the common case when asking for
either.
2024-04-23 18:24:17 -07:00
Daniel Hiltgen
16b52331a4
Merge pull request #3857 from dhiltgen/mem_escape_valve
Add back memory escape valve
2024-04-23 17:32:24 -07:00
Daniel Hiltgen
5445aaa94e Add back memory escape valve
If we get our predictions wrong, this can be used to
set a lower memory limit as a workaround.  Recent multi-gpu
refactoring accidentally removed it, so this adds it back.
2024-04-23 17:09:02 -07:00
Daniel Hiltgen
2ac3dd6853
Merge pull request #3850 from dhiltgen/windows_packaging
Move nested payloads to installer and zip file on windows
2024-04-23 16:35:20 -07:00
Daniel Hiltgen
d8851cb7a0 Harden sched TestLoad
Give the go routine a moment to deliver the expired event
2024-04-23 16:14:47 -07:00
Daniel Hiltgen
058f6cd2cc Move nested payloads to installer and zip file on windows
Now that the llm runner is an executable and not just a dll, more users are facing
problems with security policy configurations on windows that prevent users
writing to directories and then executing binaries from the same location.
This change removes payloads from the main executable on windows and shifts them
over to be packaged in the installer and discovered based on the executables location.
This also adds a new zip file for people who want to "roll their own" installation model.
2024-04-23 16:14:47 -07:00
Daniel Hiltgen
790cf34d17
Merge pull request #3846 from dhiltgen/missing_runner
Detect and recover if runner removed
2024-04-23 13:14:12 -07:00
Michael
928d844896
adding phi-3 mini to readme
adding phi-3 mini to readme
2024-04-23 13:58:31 -04:00
Daniel Hiltgen
939d6a8606 Make CI lint verbvose 2024-04-23 10:17:42 -07:00
Daniel Hiltgen
58888a74bc Detect and recover if runner removed
Tmp cleaners can nuke the file out from underneath us.  This detects the missing
runner, and re-initializes the payloads.
2024-04-23 10:05:26 -07:00
Daniel Hiltgen
cc5a71e0e3
Merge pull request #3709 from remy415/custom-gpu-defs
Adds support for customizing GPU build flags in llama.cpp
2024-04-23 09:28:34 -07:00
Michael Yang
e83bcf7f9a
Merge pull request #3836 from ollama/mxyng/mixtral
fix: mixtral graph
2024-04-23 09:15:10 -07:00
Daniel Hiltgen
5690e5ce99
Merge pull request #3418 from dhiltgen/concurrency
Request and model concurrency
2024-04-23 08:31:38 -07:00
Daniel Hiltgen
f2ea8470e5 Local unicode test case 2024-04-22 19:29:12 -07:00
Daniel Hiltgen
34b9db5afc Request and model concurrency
This change adds support for multiple concurrent requests, as well as
loading multiple models by spawning multiple runners. The default
settings are currently set at 1 concurrent request per model and only 1
loaded model at a time, but these can be adjusted by setting
OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.
2024-04-22 19:29:12 -07:00
Daniel Hiltgen
ee448deaba
Merge pull request #3835 from dhiltgen/harden_llm_override
Trim spaces and quotes from llm lib override
2024-04-22 19:06:54 -07:00
Bruce MacDonald
6e8db04716 tidy community integrations
- move some popular integrations to the top of the lists
2024-04-22 17:29:08 -07:00
Bruce MacDonald
658e60cf73 Revert "stop running model on interactive exit"
This reverts commit fad00a85e5.
2024-04-22 17:23:11 -07:00
Bruce MacDonald
4c78f028f8 Merge branch 'main' of https://github.com/ollama/ollama 2024-04-22 17:22:28 -07:00
Michael Yang
435cc866a3 fix: mixtral graph 2024-04-22 17:19:44 -07:00
Hao Wu
c7d3a558f6
docs: update README to add chat (web UI) for LLM (#3810)
* add chat (web UI) for LLM

I have used chat with llama3 in local successfully and the code is MIT licensed.

* Update README.md

---------

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2024-04-22 20:19:39 -04:00
Maple Gao
089cdb2877
docs: Update README for Lobe-chat integration. (#3817)
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2024-04-22 20:18:15 -04:00
Võ Đình Đạt
ea1e9aa36b
Update README.md (#3655) 2024-04-22 20:16:55 -04:00
Jonathan Smoley
d0d28ef90d
Update README.md with Discord-Ollama project (#3633)
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2024-04-22 20:14:20 -04:00
Eric Curtin
6654186a7c
Add podman-ollama to terminal apps (#3626)
The goal of podman-ollama is to make AI even more boring.

Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2024-04-22 20:13:23 -04:00
Daniel Hiltgen
aa72281eae Trim spaces and quotes from llm lib override 2024-04-22 17:11:14 -07:00
reid41
74bcbf828f
add qa-pilot link (#3612)
* add qa-pilot link

* format the link

* add shell-pilot
2024-04-22 20:10:34 -04:00
Christian Neff
fe39147e64
Add Chatbot UI v2 to Community Integrations (#3503) 2024-04-22 20:09:55 -04:00
Bruce MacDonald
fad00a85e5 stop running model on interactive exit 2024-04-22 16:22:14 -07:00
Jeremy
9c0db4cc83
Update gen_windows.ps1
Fixed improper env references
2024-04-21 16:13:41 -04:00
Cheng
62be2050dd
chore: use errors.New to replace fmt.Errorf will much better (#3789) 2024-04-20 22:11:06 -04:00
Blake Mizerany
56f8aa6912
types/model: export IsValidNamePart (#3788) 2024-04-20 18:26:34 -07:00
Sri Siddhaarth
e6f9bfc0e8
Update api.md (#3705) 2024-04-20 15:17:03 -04:00
Jeremy
6f18297b3a
Update gen_windows.ps1
Forgot a " on the write-host
2024-04-18 19:47:44 -04:00
Jeremy
15016413de
Update gen_windows.ps1
Added OLLAMA_CUSTOM_CUDA_DEFS and OLLAMA_CUSTOM_ROCM_DEFS to customize GPU builds on Windows
2024-04-18 19:27:16 -04:00
Jeremy
440b7190ed
Update gen_linux.sh
Added OLLAMA_CUSTOM_CUDA_DEFS and OLLAMA_CUSTOM_ROCM_DEFS instead of OLLAMA_CUSTOM_GPU_DEFS
2024-04-18 19:18:10 -04:00
Daniel Hiltgen
8d1995c625
Merge pull request #3708 from remy415/arm64static
move Ollama static build to its own flag
2024-04-18 16:04:12 -07:00
Daniel Hiltgen
fd01fbf038
Merge pull request #3710 from remy415/update-jetson-docs
update jetson tutorial
2024-04-18 16:02:08 -07:00
Blake Mizerany
0408205c1c
types/model: accept former : as a separator in digest (#3724)
This also converges the old sep `:` to the new sep `-`.
2024-04-18 14:17:46 -07:00
Jeffrey Morgan
63a7edd771
Update README.md 2024-04-18 16:09:38 -04:00
Michael
554ffdcce3
add llama3 to readme
add llama3 to readme
2024-04-18 15:18:48 -04:00
Jeremy
9850a4ce08
Merge branch 'ollama:main' into update-jetson-docs 2024-04-18 09:55:17 -04:00
Jeremy
3934c15895
Merge branch 'ollama:main' into custom-gpu-defs 2024-04-18 09:55:10 -04:00
Jeremy
fd048f1367
Merge branch 'ollama:main' into arm64static 2024-04-18 09:55:04 -04:00
Michael Yang
8645076a71
Merge pull request #3712 from ollama/mxyng/mem
add stablelm graph calculation
2024-04-17 15:57:51 -07:00
Michael Yang
05e9424824
Merge pull request #3664 from ollama/mxyng/fix-padding-2
fix padding to only return padding
2024-04-17 15:57:40 -07:00
Michael Yang
52ebe67a98
Merge pull request #3714 from ollama/mxyng/model-name-host
types/model: support : in PartHost for host:port
2024-04-17 15:34:03 -07:00
Michael Yang
889b31ab78 types/model: support : in PartHost for host:port 2024-04-17 15:16:07 -07:00
Michael Yang
3cf483fe48 add stablelm graph calculation 2024-04-17 13:57:19 -07:00