jmorganca
afa8d6e9d5
patch gemma support
2024-07-30 18:07:29 -07:00
royjhan
1b44d873e7
Add Metrics to api\embed response ( #5709 )
...
* add prompt tokens to embed response
* rm slog
* metrics
* types
* prompt n
* clean up
* reset submodule
* update tests
* test name
* list metrics
2024-07-30 13:12:21 -07:00
Daniel Hiltgen
cef2c6054d
Merge pull request #5859 from dhiltgen/homogeneous_gpus
...
Prevent partial loading on mixed GPU brands
2024-07-30 11:06:42 -07:00
Daniel Hiltgen
345420998e
Prevent partial loading on mixed GPU brands
...
In multi-brand GPU setups, if we couldn't fully load the model we
would fall through the scheduler and mistakenly try to load across
a mix of brands. This makes sure we find the set of GPU(s) that
best fits the partial load.
2024-07-30 11:00:55 -07:00
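
Note: the brand-grouping idea described in this commit can be sketched roughly as follows. This is not the scheduler code from the repository. The `GPU` type, its fields, and `pickHomogeneousGroup` are hypothetical names, and "best fit" is assumed here to mean the single-brand group with the most total free memory.

```go
package main

// GPU is a hypothetical, simplified description of one device.
type GPU struct {
	ID      string
	Brand   string // hypothetical label, e.g. "cuda" or "rocm"
	FreeMem uint64 // free memory in bytes
}

// pickHomogeneousGroup groups devices by brand and returns the group
// with the largest total free memory, so a partial load never spans
// more than one brand. Ties go to the brand seen first.
// Assumption: "best fit" means "most total free memory".
func pickHomogeneousGroup(gpus []GPU) []GPU {
	groups := map[string][]GPU{}
	totals := map[string]uint64{}
	var order []string // brands in first-seen order, for deterministic ties
	for _, g := range gpus {
		if _, seen := groups[g.Brand]; !seen {
			order = append(order, g.Brand)
		}
		groups[g.Brand] = append(groups[g.Brand], g)
		totals[g.Brand] += g.FreeMem
	}
	if len(order) == 0 {
		return nil
	}
	best := order[0]
	for _, b := range order[1:] {
		if totals[b] > totals[best] {
			best = b
		}
	}
	return groups[best]
}
```

With two "cuda" devices totalling 12 units and one "rocm" device with 16, this sketch returns only the "rocm" device rather than a mixed set.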
Kim Hallberg
0be8baad2b
Update and Fix example models ( #6065 )
...
* Update example models
* Remove unused README.md
2024-07-29 23:56:37 -07:00
Daniel Hiltgen
1a83581a8e
Merge pull request #5895 from dhiltgen/sched_faq
...
Better explain multi-gpu behavior
2024-07-29 14:25:41 -07:00
Daniel Hiltgen
37926eb991
Merge pull request #5927 from dhiltgen/high_cpu_count
...
Ensure amd gpu nodes are numerically sorted
2024-07-29 14:24:57 -07:00
Daniel Hiltgen
3d4634fdff
Merge pull request #5934 from dhiltgen/missing_cuda_repo
...
Report better error on cuda unsupported os/arch
2024-07-29 14:24:20 -07:00
royjhan
365431d406
return tool calls finish reason for openai ( #5995 )
...
* hot fix
* backend stream support
* clean up
* finish reason
* move to openai
2024-07-29 13:56:57 -07:00
Daniel Hiltgen
161e12cecf
Merge pull request #5932 from dhiltgen/win_font
...
Explain font problems on windows 10
2024-07-29 13:40:24 -07:00
Jeffrey Morgan
46e6327e0f
api: add stringifier for Tool ( #5891 )
2024-07-29 13:35:16 -07:00
Jeffrey Morgan
68ee42f995
update llama.cpp submodule to 6eeaeba1 ( #6039 )
2024-07-29 13:20:26 -07:00
Ikko Eltociear Ashimine
f26aef9a8b
docs: update README.md ( #6059 )
...
HuggingFace -> Hugging Face
2024-07-29 10:53:30 -07:00
Michael Yang
38d9036b59
Merge pull request #5992 from ollama/mxyng/save
...
fix: model save
2024-07-29 09:53:19 -07:00
Veit Heller
6f26e9322f
Fix typo in image docs ( #6041 )
2024-07-29 08:50:53 -07:00
Jeffrey Morgan
0e4d653687
update to llama3.1 elsewhere in repo ( #6032 )
2024-07-28 19:56:02 -07:00
Michael
2c01610616
update readme to llama3.1 ( #5933 )
2024-07-28 14:21:38 -07:00
Tibor Schmidt
f3d7a481b7
feat: add support for min_p ( resolve #1142 ) ( #1825 )
2024-07-27 14:37:40 -07:00
Jeffrey Morgan
f2a96c7d77
llm: keep patch for llama 3 rope factors ( #5987 )
2024-07-26 15:20:52 -07:00
Daniel Hiltgen
e8a66680d1
Merge pull request #5705 from dhiltgen/win_errormode
...
Enable windows error dialog for subprocess
2024-07-26 14:49:34 -07:00
Michael Yang
079b2c3b03
Merge pull request #5999 from ollama/mxyng/fix-push
...
fix nil deref in auth.go
2024-07-26 14:28:34 -07:00
Blake Mizerany
750c1c55f7
server: fix race conditions during download ( #5994 )
...
This fixes various data races scattered throughout the download/pull
client where the client was accessing the download state concurrently.
This commit is mostly a hot-fix and will be replaced by a new client one
day soon.
Also, remove the unnecessary opts argument from downloadChunk.
2024-07-26 14:24:24 -07:00
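
Note: a minimal sketch of the kind of fix this commit describes, guarding shared download state with a mutex so concurrent chunk workers do not race. The `downloadState` type and `downloadAll` function are hypothetical and only illustrate the pattern. They are not taken from the download client itself.

```go
package main

import "sync"

// downloadState is a hypothetical stand-in for progress shared
// between chunk workers and whoever reports progress.
type downloadState struct {
	mu        sync.Mutex
	completed int64 // bytes finished so far; guarded by mu
}

// add records n more completed bytes.
func (s *downloadState) add(n int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.completed += n
}

// Completed returns the number of bytes finished so far.
func (s *downloadState) Completed() int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.completed
}

// downloadAll simulates one goroutine per chunk, each reporting its
// size into the shared state, and returns the final total.
func downloadAll(chunkSizes []int64) int64 {
	var s downloadState
	var wg sync.WaitGroup
	for _, n := range chunkSizes {
		wg.Add(1)
		go func(n int64) {
			defer wg.Done()
			s.add(n)
		}(n)
	}
	wg.Wait()
	return s.Completed()
}
```

Running a program built around this under `go run -race` should report no data race, whereas reading and writing `completed` without the mutex would.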
Michael Yang
a622c47bd3
fix nil deref in auth.go
2024-07-26 14:14:48 -07:00
Michael Yang
ec4c35fe99
Merge pull request #5512 from ollama/mxyng/detect-stop
...
autodetect stop parameters from template
2024-07-26 13:48:23 -07:00
Michael Yang
a250c2cb13
display messages
2024-07-26 13:39:57 -07:00
Michael Yang
3d9de805b7
fix: model save
...
stop parameter is saved as a slice which is incompatible with modelfile
parsing
2024-07-26 13:23:06 -07:00
Michael Yang
15af558423
include modelfile messages
2024-07-26 11:40:11 -07:00
Jeffrey Morgan
f5e3939220
Update api.md ( #5968 )
2024-07-25 23:10:18 -04:00
Jeffrey Morgan
ae27d9dcfd
Update openai.md
2024-07-25 20:27:33 -04:00
Michael Yang
37096790a7
Merge pull request #5552 from ollama/mxyng/messages-docs
...
docs
2024-07-25 16:26:19 -07:00
Michael Yang
997c903884
Update docs/template.md
...
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-07-25 16:23:40 -07:00
Blake Mizerany
c8af3c2d96
server: reuse original download URL for images ( #5962 )
...
This changes the registry client to reuse the original download URL
it gets on the first redirect response for all subsequent requests,
preventing thundering herd issues when hot new LLMs are released.
2024-07-25 15:58:30 -07:00
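
Note: the caching idea in this commit can be illustrated with a small sketch. It assumes a hypothetical `lookup` callback that stands in for the first request to the registry and returns the redirect target. The real client's types and request handling are not shown here.

```go
package main

import "sync"

// urlCache remembers the first resolved download URL so later chunk
// requests can go straight to it instead of asking for a redirect
// again. Hypothetical type, for illustration only.
type urlCache struct {
	once sync.Once
	url  string
	err  error
}

// resolve calls lookup at most once, no matter how many goroutines
// call resolve, and hands the cached result to every caller.
// Caveat of this sketch: a failed lookup is cached as well.
func (c *urlCache) resolve(lookup func() (string, error)) (string, error) {
	c.once.Do(func() {
		c.url, c.err = lookup()
	})
	return c.url, c.err
}
```

Because every chunk request shares one resolved URL, a popular new model causes one redirect lookup per download rather than one per chunk.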
Jeffrey Morgan
455e61170d
Update openai.md
2024-07-25 18:34:47 -04:00
royjhan
4de1370a9d
openai tools doc ( #5617 )
2024-07-25 18:34:06 -04:00
Jeffrey Morgan
bbf8f102ee
Revert "llm(llama): pass rope factors ( #5924 )" ( #5963 )
...
This reverts commit bb46bbcf5e.
2024-07-25 18:24:55 -04:00
Daniel Hiltgen
ce3c93b08f
Report better error on cuda unsupported os/arch
...
If we detect an NVIDIA GPU, but NVIDIA doesn't support the os/arch,
this will report a better error for the user and point them to docs
to self-install the drivers if possible.
2024-07-24 17:09:20 -07:00
Daniel Hiltgen
6c2129d5d0
Explain font problems on windows 10
2024-07-24 15:22:00 -07:00
Daniel Hiltgen
7c2a157ca4
Ensure amd gpu nodes are numerically sorted
...
For systems that enumerate over 10 CPUs the default lexicographical
sort order interleaves CPUs and GPUs.
2024-07-24 13:43:26 -07:00
Michael Yang
bb46bbcf5e
llm(llama): pass rope factors ( #5924 )
2024-07-24 16:05:59 -04:00
royjhan
ac33aa7d37
Fix Embed Test Flakes ( #5893 )
...
* float cmp
* increase tolerance
2024-07-24 11:15:46 -07:00
Daniel Hiltgen
830fdd2715
Better explain multi-gpu behavior
2024-07-23 15:16:38 -07:00
Ajay Chintala
a6cd8f6169
Update README.md to add LLMStack integration ( #5799 )
2024-07-23 14:40:23 -04:00
Daniel Hiltgen
c78089263a
Merge pull request #5864 from dhiltgen/bump_go
...
Bump Go patch version
2024-07-22 16:34:18 -07:00
Daniel Hiltgen
3e5ea035d5
Merge pull request #5757 from lreed-mdsol/lreed/bump-go-version-fix-vulnerabilities
...
bump go version to 1.22.5 to fix security vulnerabilities in docker
2024-07-22 16:32:43 -07:00
Daniel Hiltgen
5d604eec5b
Bump Go patch version
2024-07-22 16:16:28 -07:00
Josh
db0968f30c
fix dupe err message ( #5857 )
2024-07-22 15:48:15 -07:00
Daniel Hiltgen
e12fff8810
Enable windows error dialog for subprocess startup
...
Make sure if something goes wrong spawning the process, the user gets
enough info to be able to try to self correct, or at least file a bug
with details so we can fix it. Once the process starts, we immediately
change back to the recommended setting to prevent the blocking dialog.
This ensures if the model fails to load (OOM, unsupported model type,
etc.) the process will exit quickly and we can scan the stdout/stderr
of the subprocess for the reason to report via API.
2024-07-22 14:07:27 -07:00
Michael Yang
9b60a038e5
update api.md
2024-07-22 13:49:51 -07:00
Michael Yang
83a0cb8d88
docs
2024-07-22 13:38:09 -07:00
royjhan
c0648233f2
api embed docs ( #5282 )
2024-07-22 13:37:08 -07:00