Commit graph

3477 commits

Author SHA1 Message Date
Daniel Hiltgen
cd5c8f6471
Optimize container images for startup (#6547)
* Optimize container images for startup

This change adjusts how to handle runner payloads to support
container builds where we keep them extracted in the filesystem.
This makes it easier to optimize the cpu/cuda vs cpu/rocm images for
size, and should result in faster startup times for container images.

* Refactor payload logic and add buildx support for faster builds

* Move payloads around

* Review comments

* Converge to buildx based helper scripts

* Use docker buildx action for release
2024-09-12 12:10:30 -07:00
dcasota
fef257c5c5
examples: updated requirements.txt for privategpt example 2024-09-11 18:56:56 -07:00
Adrian Cole
d066d9b8e0
examples: polish loganalyzer example (#6744) 2024-09-11 18:37:37 -07:00
RAPID ARCHITECT
5a00dc9fc9
readme: add ollama_moe to community integrations (#6752) 2024-09-11 18:36:26 -07:00
Jesse Gross
c354e87809
Merge pull request #6767 from ollama/jessegross/bug_6707
runner: Flush pending responses before returning
2024-09-11 17:20:22 -07:00
Jesse Gross
93ac3760cb runner: Flush pending responses before returning
If there are any pending reponses (such as from potential stop
tokens) then we should send them back before ending the sequence.
Otherwise, we can be missing tokens at the end of a response.

Fixes #6707
2024-09-11 16:39:32 -07:00
Patrick Devine
abed273de3
add "stop" command (#6739) 2024-09-11 16:36:21 -07:00
Michael Yang
034392624c
Merge pull request #6762 from ollama/mxyng/show-output
refactor show ouput
2024-09-11 14:58:40 -07:00
Michael Yang
ecab6f1cc5 refactor show ouput
fixes line wrapping on long texts
2024-09-11 14:23:09 -07:00
Petr Mironychev
7d6900827d
readme: add QodeAssist to community integrations (#6754) 2024-09-11 13:19:49 -07:00
Daniel Hiltgen
9246e6dd15
Verify permissions for AMD GPU (#6736)
This adds back a check which was lost many releases back to verify /dev/kfd permissions
which when lacking, can lead to confusing failure modes of:
  "rocBLAS error: Could not initialize Tensile host: No devices found"

This implementation does not hard fail the serve command but instead will fall back to CPU
with an error log.  In the future we can include this in the GPU discovery UX to show
detected but unsupported devices we discovered.
2024-09-11 11:38:25 -07:00
Michael Yang
735a0ca2e4
Merge pull request #6732 from ollama/mxyng/debug-proxy
add *_proxy to env map for debugging
2024-09-10 16:13:25 -07:00
Michael Yang
dddb72e084 add *_proxy for debugging 2024-09-10 09:43:35 -07:00
Jeffrey Morgan
83a9b5271a
docs: update examples to use llama3.1 (#6718) 2024-09-09 22:47:16 -07:00
Daniel Hiltgen
4a8069f9c4
Quiet down dockers new lint warnings (#6716)
* Quiet down dockers new lint warnings

Docker has recently added lint warnings to build.  This cleans up those warnings.

* Fix go lint regression
2024-09-09 17:22:20 -07:00
Patrick Devine
84b84ce2db
catch when model vocab size is set correctly (#6714) 2024-09-09 17:18:54 -07:00
Jeffrey Morgan
bb6a086d63
readme: add crewAI to community integrations (#6699) 2024-09-08 00:36:24 -07:00
RAPID ARCHITECT
30c8f201cc
readme: add crewAI with mesop to community integrations 2024-09-08 00:35:59 -07:00
frob
06d4fba851
openai: align chat temperature and frequency_penalty options with completion (#6688) 2024-09-07 09:08:08 -07:00
Jeffrey Morgan
108fb6c1d1
docs: improve linux install documentation (#6683)
Includes small improvements to document layout and code blocks
2024-09-06 22:05:37 -07:00
Yaroslav
da915345d1
openai: don't scale temperature or frequency_penalty (#6514) 2024-09-06 17:45:45 -07:00
nickthecook
8a027bc401
readme: add Archyve to community integrations (#6680) 2024-09-06 14:06:01 -07:00
imoize
5446903fbd
readme: add Plasmoid Ollama Control to community integrations (#6681) 2024-09-06 14:04:12 -07:00
Daniel Hiltgen
56318fb365
Improve logging on GPU too small (#6666)
When we determine a GPU is too small for any layers, it's not always clear why.
This will help troubleshoot those scenarios.
2024-09-06 08:29:36 -07:00
frob
fe91d7fff1
openai: fix "presence_penalty" typo and add test (#6665) 2024-09-06 01:16:28 -07:00
Patrick Devine
608e87bf87
Fix gemma2 2b conversion (#6645) 2024-09-05 17:02:28 -07:00
Daniel Hiltgen
48685c6ed0
Document uninstall on windows (#6663) 2024-09-05 15:57:38 -07:00
Daniel Hiltgen
9565fa64a8
Revert "Detect running in a container (#6495)" (#6662)
This reverts commit a60d9b89ce.
2024-09-05 14:26:00 -07:00
Daniel Hiltgen
6719097649
llm: make load time stall duration configurable via OLLAMA_LOAD_TIMEOUT
With the new very large parameter models, some users are willing to wait for
a very long time for models to load.
2024-09-05 14:00:08 -07:00
Daniel Hiltgen
b05c9e83d9
Introduce GPU Overhead env var (#5922)
Provide a mechanism for users to set aside an amount of VRAM on each GPU
to make room for other applications they want to start after Ollama, or workaround
memory prediction bugs
2024-09-05 13:46:35 -07:00
Daniel Hiltgen
a60d9b89ce
Detect running in a container (#6495) 2024-09-05 13:24:51 -07:00
Michael Yang
bf612cd608
Merge pull request #6260 from ollama/mxyng/mem
llama3.1 memory
2024-09-05 13:22:08 -07:00
Zeyo
ef98e56122
readme: add AiLama to the list of community integrations (#4957) 2024-09-05 13:10:44 -07:00
Michael
5f944baac7
Update gpu.md: Add RTX 3050 Ti and RTX 3050 Ti (#5888)
* Update gpu.md

    Seems strange that the laptop versions of 3050 and 3050 Ti would be supported but not the non-notebook, but this is what the page (https://developer.nvidia.com/cuda-gpus) says.

Signed-off-by: bean5 <2052646+bean5@users.noreply.github.com>

* Update gpu.md

Remove notebook reference

---------

Signed-off-by: bean5 <2052646+bean5@users.noreply.github.com>
2024-09-05 11:24:26 -07:00
Tobias Heinze
6fc9d22707
server: fix blob download when receiving a 200 response (#6656) 2024-09-05 10:48:26 -07:00
Vitaly Zdanevich
f27c00d8c5
readme: add Gentoo package manager entry to community integrations (#5714) 2024-09-05 09:58:14 -07:00
王卿
c7c845ec52
Update install.sh:Replace "command -v" with encapsulated functionality (#6035)
Replace "command -v" with encapsulated functionality
2024-09-05 09:49:48 -07:00
Augustinas Malinauskas
cf48603943
readme: include Enchanted for Apple Vision Pro (#4949)
Added Enchanted with Apple Vision Pro support
2024-09-05 01:30:19 -04:00
Silas Marvin
6e67be09b6
readme: add lsp-ai to community integrations (#5063) 2024-09-05 01:17:34 -04:00
Arda Günsüren
0f5f060d2b
readme: add ollama-php library to community integrations (#6361) 2024-09-05 01:01:14 -04:00
jk011ru
b3554778bd
readme: add vnc-lm discord bot community integration (#6644) 2024-09-04 19:46:02 -04:00
Pascal Patry
bbe7b96ded
llm: use json.hpp from common (#6642) 2024-09-04 19:34:42 -04:00
Rune Berg
c18ff18b2c
readme: add confichat to community integrations (#6378) 2024-09-04 17:26:02 -04:00
Tomoya Fujita
133770a548
docs: add group to manual Linux isntructions and verify service is running (#6430) 2024-09-04 14:45:09 -04:00
Teïlo M
f36ebfb478
readme: add gollm to the list of community libraries (#6099) 2024-09-04 14:19:41 -04:00
亢奋猫
5b55379651
readme: add Cherry Studio to community integrations (#6633) 2024-09-04 10:53:36 -04:00
Mitar
93eb43d020
readme: add Go fun package (#6421) 2024-09-04 10:52:46 -04:00
Carter
369479cc30
docs: fix spelling error (#6391)
change "dorrect" to "correct"
2024-09-04 09:42:33 -04:00
Erkin Alp Güney
7d89e48f5c
install.sh: update instructions to use WSL2 (#6450) 2024-09-04 09:34:53 -04:00
Sam
27bcce6d9f
readme: add claude-dev to community integrations (#6630) 2024-09-04 09:32:26 -04:00