Commit graph

363 commits

Author SHA1 Message Date
Daniel Hiltgen
9246e6dd15
Verify permissions for AMD GPU (#6736)
This adds back a check which was lost many releases back to verify /dev/kfd permissions
which when lacking, can lead to confusing failure modes of:
  "rocBLAS error: Could not initialize Tensile host: No devices found"

This implementation does not hard fail the serve command but instead will fall back to CPU
with an error log.  In the future we can include this in the GPU discovery UX to show
detected but unsupported devices we discovered.
2024-09-11 11:38:25 -07:00
Jeffrey Morgan
83a9b5271a
docs: update examples to use llama3.1 (#6718) 2024-09-09 22:47:16 -07:00
Jeffrey Morgan
108fb6c1d1
docs: improve linux install documentation (#6683)
Includes small improvements to document layout and code blocks
2024-09-06 22:05:37 -07:00
Daniel Hiltgen
48685c6ed0
Document uninstall on windows (#6663) 2024-09-05 15:57:38 -07:00
Michael
5f944baac7
Update gpu.md: Add RTX 3050 Ti and RTX 3050 Ti (#5888)
* Update gpu.md

    Seems strange that the laptop versions of 3050 and 3050 Ti would be supported but not the non-notebook, but this is what the page (https://developer.nvidia.com/cuda-gpus) says.

Signed-off-by: bean5 <2052646+bean5@users.noreply.github.com>

* Update gpu.md

Remove notebook reference

---------

Signed-off-by: bean5 <2052646+bean5@users.noreply.github.com>
2024-09-05 11:24:26 -07:00
Tomoya Fujita
133770a548
docs: add group to manual Linux isntructions and verify service is running (#6430) 2024-09-04 14:45:09 -04:00
SnoopyTlion
741affdfd6
docs: update faq.md for OLLAMA_MODELS env var permissions (#6587) 2024-09-02 15:31:29 -04:00
rayfiyo
1aad838707
docs: update GGUF examples and references (#6577) 2024-08-31 19:34:25 -07:00
Patrick Devine
8e4e509fa4
update the openai docs to explain how to set the context size (#6548) 2024-08-28 17:11:46 -07:00
Patrick Devine
d13c3daa0b
add safetensors to the modelfile docs (#6532) 2024-08-27 14:46:47 -07:00
Patrick Devine
1713eddcd0
Fix import image width (#6528) 2024-08-27 14:19:47 -07:00
Daniel Hiltgen
4e1c4f6e0b
Update manual instructions with discrete ROCm bundle (#6445) 2024-08-27 13:42:28 -07:00
Patrick Devine
1c70a00f71 adjust image sizes 2024-08-27 11:15:25 -07:00
Patrick Devine
ac80010db8
update the import docs (#6104) 2024-08-26 19:57:26 -07:00
Michael Yang
bb362caf88 update faq 2024-08-23 13:37:21 -07:00
Daniel Hiltgen
f9e31da946 Review comments 2024-08-19 10:36:15 -07:00
Daniel Hiltgen
88bb9e3328 Adjust layout to bin+lib/ollama 2024-08-19 09:38:53 -07:00
Bruce MacDonald
eda8a32a09
update chatml template format to latest in docs (#6344) 2024-08-13 16:39:18 -07:00
Pamela Fox
1f32276178
Update openai.md to remove extra checkbox (#6345) 2024-08-13 13:36:05 -07:00
Michael Yang
bd5e432630 update import.md 2024-08-12 15:13:29 -07:00
royjhan
5b3a21b578
add metrics to docs (#6079) 2024-08-07 14:43:44 -07:00
Kyle Kelley
ad0c19dde4
Use llama3.1 in tools example (#5985)
* Use llama3.1 in tools example

* Update api.md
2024-08-07 17:20:50 -04:00
Michael Yang
39f2bc6bfc
Merge pull request #6167 from ollama/mxyng/line-feed
line feed
2024-08-05 00:06:28 -07:00
frob
b73b0940ef
Disable paging for journalctl (#6154)
Users using `journalctl` to get logs for issue logging sometimes don't realize that paging is causing information to be missed.
2024-08-05 00:10:53 -04:00
Michael Yang
6a07344786 line feed 2024-08-04 17:25:41 -07:00
royjhan
4addf6b587
Update OpenAI Compatibility Docs with /v1/completions (#5311)
* Update docs

* token bug corrected

* Update docs/openai.md

* Update docs/openai.md

* add suffix

* merge conflicts

* merge conflicts
2024-08-02 13:16:23 -07:00
royjhan
85c7f11170
Update docs (#5310) 2024-08-02 13:05:57 -07:00
Kim Hallberg
ce1fb4447e
Fix models/{model} URL (#6132) 2024-08-01 16:31:47 -07:00
royjhan
558a54b098
Update OpenAI Compatibility Docs with /v1/embeddings (#5470)
* docs without usage

* no usage

* rm metric note
2024-08-01 16:00:29 -07:00
royjhan
ed52833bb1
Add to docs (#5309) 2024-08-01 15:58:13 -07:00
royjhan
f561eecfb8
Update OpenAI Compatibility Docs with /v1/models (#5151)
* OpenAI Docs

* Update docs/openai.md

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Remove newline

---------

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-08-01 15:48:44 -07:00
Daniel Hiltgen
1a83581a8e
Merge pull request #5895 from dhiltgen/sched_faq
Better explain multi-gpu behavior
2024-07-29 14:25:41 -07:00
Daniel Hiltgen
161e12cecf
Merge pull request #5932 from dhiltgen/win_font
Explain font problems on windows 10
2024-07-29 13:40:24 -07:00
Veit Heller
6f26e9322f
Fix typo in image docs (#6041) 2024-07-29 08:50:53 -07:00
Jeffrey Morgan
0e4d653687
upate to llama3.1 elsewhere in repo (#6032) 2024-07-28 19:56:02 -07:00
Tibor Schmidt
f3d7a481b7
feat: add support for min_p (resolve #1142) (#1825) 2024-07-27 14:37:40 -07:00
Jeffrey Morgan
f5e3939220
Update api.md (#5968) 2024-07-25 23:10:18 -04:00
Jeffrey Morgan
ae27d9dcfd
Update openai.md 2024-07-25 20:27:33 -04:00
Michael Yang
37096790a7
Merge pull request #5552 from ollama/mxyng/messages-docs
docs
2024-07-25 16:26:19 -07:00
Michael Yang
997c903884
Update docs/template.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-07-25 16:23:40 -07:00
Jeffrey Morgan
455e61170d
Update openai.md 2024-07-25 18:34:47 -04:00
royjhan
4de1370a9d
openai tools doc (#5617) 2024-07-25 18:34:06 -04:00
Daniel Hiltgen
6c2129d5d0 Explain font problems on windows 10 2024-07-24 15:22:00 -07:00
Daniel Hiltgen
830fdd2715 Better explain multi-gpu behavior 2024-07-23 15:16:38 -07:00
Michael Yang
9b60a038e5 update api.md 2024-07-22 13:49:51 -07:00
Michael Yang
83a0cb8d88 docs 2024-07-22 13:38:09 -07:00
royjhan
c0648233f2
api embed docs (#5282) 2024-07-22 13:37:08 -07:00
Daniel Hiltgen
283948c83b Adjust windows ROCm discovery
The v5 hip library returns unsupported GPUs which wont enumerate at
inference time in the runner so this makes sure we align discovery.  The
gfx906 cards are no longer supported so we shouldn't compile with that
GPU type as it wont enumerate at runtime.
2024-07-20 15:17:50 -07:00
royjhan
0d41623b52
OpenAI: Add Suffix to v1/completions (#5611)
* add suffix

* remove todo

* remove TODO

* add to test

* rm outdated prompt tokens info md

* fix test

* fix test
2024-07-16 20:50:14 -07:00
Daniel Hiltgen
1f50356e8e Bump ROCm on windows to 6.1.2
This also adjusts our algorithm to favor our bundled ROCm.
I've confirmed VRAM reporting still doesn't work properly so we
can't yet enable concurrency by default.
2024-07-10 11:01:22 -07:00