Daniel Hiltgen
d8be22e47d
Fix overlapping artifact name on CI
2024-08-19 12:07:18 -07:00
Daniel Hiltgen
652c273f0e
Merge pull request #5049 from dhiltgen/cuda_v12
...
Cuda v12
2024-08-19 11:14:24 -07:00
Daniel Hiltgen
88e7705079
Merge pull request #6402 from rick-github/numParallel
...
Override numParallel in pickBestPartialFitByLibrary() only if unset.
2024-08-19 11:07:22 -07:00
Daniel Hiltgen
f9e31da946
Review comments
2024-08-19 10:36:15 -07:00
Daniel Hiltgen
88bb9e3328
Adjust layout to bin+lib/ollama
2024-08-19 09:38:53 -07:00
Daniel Hiltgen
3b19cdba2a
Remove Jetpack
2024-08-19 09:38:53 -07:00
Daniel Hiltgen
927d98a6cd
Add windows cuda v12 + v11 support
2024-08-19 09:38:53 -07:00
Daniel Hiltgen
f6c811b320
Enable cuda v12 flags
2024-08-19 09:38:53 -07:00
Daniel Hiltgen
4fe3a556fa
Add cuda v12 variant and selection logic
...
Based on compute capability and driver version, pick
v12 or v11 cuda variants.
2024-08-19 09:38:53 -07:00
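A minimal sketch of the kind of selection this commit describes. The struct fields, the function name, and both thresholds are assumptions for illustration; the commit message only says the choice depends on compute capability and driver version.

```go
package main

import "fmt"

// gpu holds the two inputs named in the commit message.
// The field names are hypothetical.
type gpu struct {
	computeMajor int // major compute capability of the device
	driverMajor  int // major version of the installed driver
}

// Assumed thresholds, for illustration only; the log does not state them.
const (
	minComputeForV12 = 6
	minDriverForV12  = 12
)

// cudaVariant returns "v12" only when both the device and the driver
// meet the assumed minimums, and falls back to "v11" otherwise.
func cudaVariant(g gpu) string {
	if g.computeMajor >= minComputeForV12 && g.driverMajor >= minDriverForV12 {
		return "v12"
	}
	return "v11"
}

func main() {
	fmt.Println(cudaVariant(gpu{computeMajor: 8, driverMajor: 12}))
	fmt.Println(cudaVariant(gpu{computeMajor: 5, driverMajor: 12}))
	fmt.Println(cudaVariant(gpu{computeMajor: 8, driverMajor: 11}))
}
```

Falling back to the older variant whenever either check fails keeps the selection conservative: a GPU is never handed a runtime that it or its driver might not support.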
Daniel Hiltgen
fc3b4cda89
Report GPU variant in log
2024-08-19 09:38:53 -07:00
Daniel Hiltgen
d470ebe78b
Add Jetson cuda variants for arm
...
This adds new variants for arm64 specific to Jetson platforms
2024-08-19 09:38:53 -07:00
Daniel Hiltgen
c7bcb00319
Wire up ccache and pigz in the docker based build
...
This should help speed things up a little
2024-08-19 09:38:53 -07:00
Daniel Hiltgen
74d45f0102
Refactor linux packaging
...
This adjusts Linux to follow a similar model to Windows, with a discrete archive
(zip/tgz) to carry the primary executable and dependent libraries. Runners are
still carried as payloads inside the main binary.
Darwin retains the payload model, where the Go binary is fully self-contained.
2024-08-19 09:38:53 -07:00
Jeffrey Morgan
9fddef3731
server: limit upload parts to 16 (#6411)
2024-08-19 09:20:52 -07:00
Richard Lyons
885cf45087
Fix white space.
2024-08-18 03:07:16 +02:00
Richard Lyons
9352eeb752
Reset NumCtx.
2024-08-18 02:55:01 +02:00
Richard Lyons
0ad0e738cd
Override numParallel only if unset.
2024-08-18 01:43:26 +02:00
zwwhdls
bdc4308afb
fix: chmod new layer to 0o644 when creating it
...
Signed-off-by: zwwhdls <zww@hdls.me>
2024-08-16 11:43:19 +08:00
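A hedged sketch of why an explicit chmod can be needed, assuming the layer is first written through a temporary file. `writeLayer` and `layerPerm` are hypothetical helpers, not code from this commit. In Go, `os.CreateTemp` creates files with mode 0o600, so the mode has to be widened afterwards if other users should be able to read the file.

```go
package main

import (
	"fmt"
	"os"
)

// writeLayer writes data to a new temporary file in dir and then
// sets its mode to 0o644 (owner read/write, everyone else read).
func writeLayer(dir string, data []byte) (string, error) {
	f, err := os.CreateTemp(dir, "layer-*")
	if err != nil {
		return "", err
	}
	defer f.Close()

	if _, err := f.Write(data); err != nil {
		return "", err
	}
	// CreateTemp uses 0o600; widen it explicitly.
	if err := f.Chmod(0o644); err != nil {
		return "", err
	}
	return f.Name(), nil
}

// layerPerm creates a layer in a scratch directory and reports the
// permission bits it ends up with.
func layerPerm() (os.FileMode, error) {
	dir, err := os.MkdirTemp("", "layers-*")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path, err := writeLayer(dir, []byte("data"))
	if err != nil {
		return 0, err
	}
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Mode().Perm(), nil
}

func main() {
	perm, err := layerPerm()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%o\n", perm)
}
```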
Daniel Hiltgen
d29cd4c2ed
Merge pull request #6381 from eust-w/main
...
fix: Add tooltip to system tray icon
2024-08-15 15:31:15 -07:00
eust-w
a84c05cf91
fix: Add tooltip to system tray icon
...
- Updated setIcon method to include tooltip text for the system tray icon.
- Added NIF_TIP flag and set the tooltip text using UTF16 encoding.
Resolves: #6372
2024-08-16 06:00:12 +08:00
Michael Yang
e3d7f32af7
Merge pull request #6363 from ollama/mxyng/fix-noprune
...
fix: noprune on pull
2024-08-15 12:20:38 -07:00
Michael Yang
3a75e74e34
only skip invalid json manifests
2024-08-15 10:29:14 -07:00
Michael Yang
237dccba1e
skip invalid manifest files
2024-08-14 16:55:45 -07:00
Michael Yang
b3f75fc812
fix noprune
2024-08-14 15:48:51 -07:00
Jeffrey Morgan
8200c371ae
add CONTRIBUTING.md (#6349)
2024-08-14 15:19:50 -07:00
longtao
0a8d6ea86d
Fix typo and improve readability (#5964)
...
* Fix typo and improve readability
Summary:
* Rename updatAvailableMenuID to updateAvailableMenuID
* Replace unused cmd parameter with _ in RunServer function
* Fix typos in comments
(cherry picked from commit 5b8715f0b04773369e8eb1f9e6737995a0ab3ba7)
* Update api/client.go
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-08-13 17:54:19 -07:00
Blake Mizerany
8e1050f366
server: reduce max connections used in download (#6347)
...
The previous value of 64 was WAY too high and unnecessary. It reached
diminishing returns and blew past it. This is a more reasonable number
for _most_ normal cases. For users on cloud servers with excellent
network quality, this will keep screaming for them, without hitting our
CDN limits. For users with relatively poor network quality, this will
keep them from saturating their network and causing other issues.
2024-08-13 16:47:35 -07:00
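One common way to cap concurrent connections in Go is a buffered channel used as a semaphore. The sketch below illustrates that general technique; `downloadParts` and the limit of 4 are made up for the example, and the log does not state the new value or how the server enforces it.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// downloadParts calls fetch once per part index, with at most limit
// calls in flight at a time, and returns the peak concurrency observed.
func downloadParts(parts, limit int, fetch func(part int)) int {
	sem := make(chan struct{}, limit) // one slot per allowed connection
	var wg sync.WaitGroup
	var active, peak atomic.Int64

	for i := 0; i < parts; i++ {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit fetches are running
		go func(part int) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot after the fetch

			n := active.Add(1)
			// Record the highest number of simultaneous fetches seen.
			for {
				p := peak.Load()
				if n <= p || peak.CompareAndSwap(p, n) {
					break
				}
			}
			fetch(part)
			active.Add(-1)
		}(i)
	}
	wg.Wait()
	return int(peak.Load())
}

func main() {
	done := make([]bool, 20)
	peak := downloadParts(len(done), 4, func(part int) {
		done[part] = true // stand-in for fetching one part
	})
	fmt.Println("peak concurrent fetches:", peak)
}
```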
Bruce MacDonald
eda8a32a09
update chatml template format to latest in docs (#6344)
2024-08-13 16:39:18 -07:00
Michael Yang
a0a40aa20c
Merge pull request #6346 from ollama/mxyng/lint
2024-08-13 14:58:35 -07:00
Michael Yang
2697d7f5aa
lint
...
- fixes printf: non-constant format string in call to fmt.Printf
- fixes SA1032: arguments have the wrong order
- disables testifylint
2024-08-13 14:36:33 -07:00
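A small sketch of the usual fix for the "non-constant format string" finding. `render` is a hypothetical helper, not code from this commit. The flagged pattern, shown only in a comment, passes a variable as the format string, so any `%` in the text is parsed as a formatting verb.

```go
package main

import (
	"fmt"
	"strings"
)

// render formats msg as one line of output.
func render(msg string) string {
	var b strings.Builder
	// Flagged form: fmt.Fprintf(&b, msg)
	// Passing msg as an argument to a constant format keeps any '%'
	// characters in msg literal.
	fmt.Fprintf(&b, "%s\n", msg)
	return b.String()
}

func main() {
	fmt.Print(render("100% done"))
}
```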
Pamela Fox
1f32276178
Update openai.md to remove extra checkbox (#6345)
2024-08-13 13:36:05 -07:00
Daniel Hiltgen
4c4fe3f87f
Merge pull request #6343 from dhiltgen/revert_win_go_version
...
Go back to a pinned Go version
2024-08-13 11:53:49 -07:00
Daniel Hiltgen
feedf49c71
Go back to a pinned Go version
...
Go version 1.22.6 is triggering AV false positives, so go back to 1.22.5
2024-08-13 11:45:44 -07:00
royjhan
8b00a415ab
Load Embedding Model on Empty Input (#6325)
...
* load on empty input
* no load on invalid input
2024-08-13 10:19:56 -07:00
Michael Yang
01b80e9ffc
Merge pull request #5443 from ollama/mxyng/convert-phi3
...
add conversion for microsoft phi 3 mini/medium 4k, 128k
2024-08-12 15:47:58 -07:00
Michael Yang
bd5e432630
update import.md
2024-08-12 15:13:29 -07:00
Bruce MacDonald
aec77d6a05
support new "longrope" attention factor
2024-08-12 15:13:29 -07:00
Michael Yang
6ffb5cb017
add conversion for microsoft phi 3 mini/medium 4k, 128k
2024-08-12 15:13:29 -07:00
Josh
f7e3b9190f
cmd: spinner progress for transfer model data (#6100)
2024-08-12 11:46:32 -07:00
Josh
980dd15f81
cmd: speed up gguf creates (#6324)
2024-08-12 11:46:09 -07:00
royjhan
01d544d373
OpenAI: Simplify input output in testing (#5858)
...
* simplify input output
* direct comp
* in line image
* rm error pointer type
* update response testing
* lint
2024-08-12 10:33:34 -07:00
Josh
1dc3ef3aa9
Revert "server: speed up single gguf creates (#5898)" (#6323)
...
This reverts commit 8aac22438e.
2024-08-12 09:57:51 -07:00
Josh
8aac22438e
server: speed up single gguf creates (#5898)
2024-08-12 09:28:55 -07:00
Jeffrey Morgan
15c2d8fe14
server: parallelize embeddings in API web handler instead of in subprocess runner (#6220)
...
For simplicity, perform parallelization of embedding requests in the API handler instead of offloading this to the subprocess runner. This keeps the scheduling story simpler as it builds on existing parallel requests, similar to existing text completion functionality.
2024-08-11 11:57:10 -07:00
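A rough sketch of handler-side fan-out, using only the standard library. `embedAll` and `fakeEmbed` are hypothetical names, and the real handler may use different primitives; the point is only that each input becomes its own request and results stay in input order.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errEmptyInput = errors.New("empty input")

// fakeEmbed stands in for a call to the runner; it returns a
// one-element "embedding" holding the input length.
func fakeEmbed(s string) ([]float32, error) {
	if s == "" {
		return nil, errEmptyInput
	}
	return []float32{float32(len(s))}, nil
}

// embedAll embeds every input concurrently and returns the results in
// input order. If any call fails, the joined error is returned instead.
func embedAll(inputs []string, embed func(string) ([]float32, error)) ([][]float32, error) {
	out := make([][]float32, len(inputs))
	errs := make([]error, len(inputs))

	var wg sync.WaitGroup
	for i, in := range inputs {
		wg.Add(1)
		go func(i int, in string) {
			defer wg.Done()
			// Each goroutine writes only to its own index, so no lock is needed.
			out[i], errs[i] = embed(in)
		}(i, in)
	}
	wg.Wait()

	if err := errors.Join(errs...); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	vecs, err := embedAll([]string{"a", "bb", "ccc"}, fakeEmbed)
	fmt.Println(vecs, err)
}
```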
Daniel Hiltgen
25906d72d1
llm: prevent loading too large models on windows (#5926)
...
Don't allow loading models that would lead to memory exhaustion (across VRAM, system memory, and disk paging). This check was already applied on Linux and should be applied on Windows as well.
2024-08-11 11:30:20 -07:00
CognitiveTech
023451ce47
add integration obook-summary (#6305)
2024-08-10 18:43:08 -07:00
Jesse Gross
9b53e39d8e
Merge pull request #6258 from coolljt0725/fix_typo
...
server/download.go: Fix a typo in log
2024-08-09 17:19:48 -07:00
Michael Yang
97fae2df95
Merge pull request #6235 from Nicholas42/fix_line_endings
...
Set *.png and *.ico to be treated as binary files.
2024-08-09 17:06:30 -07:00
Michael Yang
160d9d4900
Merge pull request #6171 from ollama/mxyng/remove-temp
...
removeall to remove non-empty temp dirs
2024-08-09 15:47:13 -07:00
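For context, a short illustration of the difference in Go's standard library: `os.Remove` fails on a directory that still has entries, while `os.RemoveAll` deletes the directory and everything under it. `removeDemo` is a hypothetical helper written for this example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// removeDemo creates a temp dir containing one file, then tries both
// removal calls and reports what happened.
func removeDemo() (removeErr, removeAllErr error, stillExists bool) {
	dir, err := os.MkdirTemp("", "demo-*")
	if err != nil {
		panic(err)
	}
	if err := os.WriteFile(filepath.Join(dir, "blob"), []byte("x"), 0o600); err != nil {
		panic(err)
	}

	removeErr = os.Remove(dir)       // fails: directory is not empty
	removeAllErr = os.RemoveAll(dir) // removes the directory and its contents

	_, statErr := os.Stat(dir)
	return removeErr, removeAllErr, statErr == nil
}

func main() {
	removeErr, removeAllErr, stillExists := removeDemo()
	fmt.Println("Remove failed:", removeErr != nil)
	fmt.Println("RemoveAll failed:", removeAllErr != nil)
	fmt.Println("directory still exists:", stillExists)
}
```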
Nicholas Schwab
d4e6407464
Restrict text files with explicit line feeds to *.go.
...
This partially reverts b732beba6a. It
seems like explicitly setting all files to use line feeds was done due
to issues with the go linter, hence it can be restricted to those files
(https://github.com/ollama/ollama/pull/6235#issuecomment-2278745953).
2024-08-09 23:14:13 +02:00