Patrick Devine
ce8ce82567
add mixtral 8x7b model conversion ( #3859 )
2024-04-23 20:17:04 -07:00
Blake Mizerany
4dc4f1be34
types/model: restrict digest hash part to a minimum of 2 characters ( #3858 )
...
This allows users of a valid Digest to know it has a minimum of 2
characters in the hash part for use when sharding.
This is a reasonable restriction as the hash part is a SHA256 hash which
is 64 characters long, which is the common hash used. There is no
anticipation of using a hash with less than 2 characters.
Also, add MustParseDigest.
Also, replace Digest.Type with Digest.Split for getting both the type
and hash parts together, which is most the common case when asking for
either.
2024-04-23 18:24:17 -07:00
Daniel Hiltgen
16b52331a4
Merge pull request #3857 from dhiltgen/mem_escape_valve
...
Add back memory escape valve
2024-04-23 17:32:24 -07:00
Daniel Hiltgen
5445aaa94e
Add back memory escape valve
...
If we get our predictions wrong, this can be used to
set a lower memory limit as a workaround. Recent multi-gpu
refactoring accidentally removed it, so this adds it back.
2024-04-23 17:09:02 -07:00
Daniel Hiltgen
2ac3dd6853
Merge pull request #3850 from dhiltgen/windows_packaging
...
Move nested payloads to installer and zip file on windows
2024-04-23 16:35:20 -07:00
Daniel Hiltgen
d8851cb7a0
Harden sched TestLoad
...
Give the go routine a moment to deliver the expired event
2024-04-23 16:14:47 -07:00
Daniel Hiltgen
058f6cd2cc
Move nested payloads to installer and zip file on windows
...
Now that the llm runner is an executable and not just a dll, more users are facing
problems with security policy configurations on windows that prevent users
writing to directories and then executing binaries from the same location.
This change removes payloads from the main executable on windows and shifts them
over to be packaged in the installer and discovered based on the executables location.
This also adds a new zip file for people who want to "roll their own" installation model.
2024-04-23 16:14:47 -07:00
Daniel Hiltgen
790cf34d17
Merge pull request #3846 from dhiltgen/missing_runner
...
Detect and recover if runner removed
2024-04-23 13:14:12 -07:00
Michael
928d844896
adding phi-3 mini to readme
...
adding phi-3 mini to readme
2024-04-23 13:58:31 -04:00
Daniel Hiltgen
939d6a8606
Make CI lint verbvose
2024-04-23 10:17:42 -07:00
Daniel Hiltgen
58888a74bc
Detect and recover if runner removed
...
Tmp cleaners can nuke the file out from underneath us. This detects the missing
runner, and re-initializes the payloads.
2024-04-23 10:05:26 -07:00
Daniel Hiltgen
cc5a71e0e3
Merge pull request #3709 from remy415/custom-gpu-defs
...
Adds support for customizing GPU build flags in llama.cpp
2024-04-23 09:28:34 -07:00
Michael Yang
e83bcf7f9a
Merge pull request #3836 from ollama/mxyng/mixtral
...
fix: mixtral graph
2024-04-23 09:15:10 -07:00
Daniel Hiltgen
5690e5ce99
Merge pull request #3418 from dhiltgen/concurrency
...
Request and model concurrency
2024-04-23 08:31:38 -07:00
Daniel Hiltgen
f2ea8470e5
Local unicode test case
2024-04-22 19:29:12 -07:00
Daniel Hiltgen
34b9db5afc
Request and model concurrency
...
This change adds support for multiple concurrent requests, as well as
loading multiple models by spawning multiple runners. The default
settings are currently set at 1 concurrent request per model and only 1
loaded model at a time, but these can be adjusted by setting
OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.
2024-04-22 19:29:12 -07:00
Daniel Hiltgen
ee448deaba
Merge pull request #3835 from dhiltgen/harden_llm_override
...
Trim spaces and quotes from llm lib override
2024-04-22 19:06:54 -07:00
Bruce MacDonald
6e8db04716
tidy community integrations
...
- move some popular integrations to the top of the lists
2024-04-22 17:29:08 -07:00
Bruce MacDonald
658e60cf73
Revert "stop running model on interactive exit"
...
This reverts commit fad00a85e5
.
2024-04-22 17:23:11 -07:00
Bruce MacDonald
4c78f028f8
Merge branch 'main' of https://github.com/ollama/ollama
2024-04-22 17:22:28 -07:00
Michael Yang
435cc866a3
fix: mixtral graph
2024-04-22 17:19:44 -07:00
Hao Wu
c7d3a558f6
docs: update README to add chat (web UI) for LLM ( #3810 )
...
* add chat (web UI) for LLM
I have used chat with llama3 in local successfully and the code is MIT licensed.
* Update README.md
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2024-04-22 20:19:39 -04:00
Maple Gao
089cdb2877
docs: Update README for Lobe-chat integration. ( #3817 )
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2024-04-22 20:18:15 -04:00
Võ Đình Đạt
ea1e9aa36b
Update README.md ( #3655 )
2024-04-22 20:16:55 -04:00
Jonathan Smoley
d0d28ef90d
Update README.md with Discord-Ollama project ( #3633 )
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2024-04-22 20:14:20 -04:00
Eric Curtin
6654186a7c
Add podman-ollama to terminal apps ( #3626 )
...
The goal of podman-ollama is to make AI even more boring.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2024-04-22 20:13:23 -04:00
Daniel Hiltgen
aa72281eae
Trim spaces and quotes from llm lib override
2024-04-22 17:11:14 -07:00
reid41
74bcbf828f
add qa-pilot link ( #3612 )
...
* add qa-pilot link
* format the link
* add shell-pilot
2024-04-22 20:10:34 -04:00
Christian Neff
fe39147e64
Add Chatbot UI v2 to Community Integrations ( #3503 )
2024-04-22 20:09:55 -04:00
Bruce MacDonald
fad00a85e5
stop running model on interactive exit
2024-04-22 16:22:14 -07:00
Jeremy
9c0db4cc83
Update gen_windows.ps1
...
Fixed improper env references
2024-04-21 16:13:41 -04:00
Cheng
62be2050dd
chore: use errors.New to replace fmt.Errorf will much better ( #3789 )
2024-04-20 22:11:06 -04:00
Blake Mizerany
56f8aa6912
types/model: export IsValidNamePart ( #3788 )
2024-04-20 18:26:34 -07:00
Sri Siddhaarth
e6f9bfc0e8
Update api.md ( #3705 )
2024-04-20 15:17:03 -04:00
Jeremy
6f18297b3a
Update gen_windows.ps1
...
Forgot a " on the write-host
2024-04-18 19:47:44 -04:00
Jeremy
15016413de
Update gen_windows.ps1
...
Added OLLAMA_CUSTOM_CUDA_DEFS and OLLAMA_CUSTOM_ROCM_DEFS to customize GPU builds on Windows
2024-04-18 19:27:16 -04:00
Jeremy
440b7190ed
Update gen_linux.sh
...
Added OLLAMA_CUSTOM_CUDA_DEFS and OLLAMA_CUSTOM_ROCM_DEFS instead of OLLAMA_CUSTOM_GPU_DEFS
2024-04-18 19:18:10 -04:00
Daniel Hiltgen
8d1995c625
Merge pull request #3708 from remy415/arm64static
...
move Ollama static build to its own flag
2024-04-18 16:04:12 -07:00
Daniel Hiltgen
fd01fbf038
Merge pull request #3710 from remy415/update-jetson-docs
...
update jetson tutorial
2024-04-18 16:02:08 -07:00
Blake Mizerany
0408205c1c
types/model: accept former :
as a separator in digest ( #3724 )
...
This also converges the old sep `:` to the new sep `-`.
2024-04-18 14:17:46 -07:00
Jeffrey Morgan
63a7edd771
Update README.md
2024-04-18 16:09:38 -04:00
Michael
554ffdcce3
add llama3 to readme
...
add llama3 to readme
2024-04-18 15:18:48 -04:00
Jeremy
9850a4ce08
Merge branch 'ollama:main' into update-jetson-docs
2024-04-18 09:55:17 -04:00
Jeremy
3934c15895
Merge branch 'ollama:main' into custom-gpu-defs
2024-04-18 09:55:10 -04:00
Jeremy
fd048f1367
Merge branch 'ollama:main' into arm64static
2024-04-18 09:55:04 -04:00
Michael Yang
8645076a71
Merge pull request #3712 from ollama/mxyng/mem
...
add stablelm graph calculation
2024-04-17 15:57:51 -07:00
Michael Yang
05e9424824
Merge pull request #3664 from ollama/mxyng/fix-padding-2
...
fix padding to only return padding
2024-04-17 15:57:40 -07:00
Michael Yang
52ebe67a98
Merge pull request #3714 from ollama/mxyng/model-name-host
...
types/model: support : in PartHost for host:port
2024-04-17 15:34:03 -07:00
Michael Yang
889b31ab78
types/model: support : in PartHost for host:port
2024-04-17 15:16:07 -07:00
Michael Yang
3cf483fe48
add stablelm graph calculation
2024-04-17 13:57:19 -07:00