Blake Mizerany
c631a9c726
types/model: relax name length constraint from 2 to 1 ( #3984 )
2024-04-27 17:58:41 -07:00
Blake Mizerany
8fd9e56804
types/structs: drop unused structs package ( #3981 )
2024-04-27 14:06:11 -07:00
Hernan Martinez
8a65717f55
Do not build AVX runners on ARM64
2024-04-26 23:55:32 -06:00
Hernan Martinez
6d3152a98a
Use architecture specific folders in installer script
2024-04-26 23:35:16 -06:00
Hernan Martinez
b438d485f1
Use architecture specific folders in the generate script
2024-04-26 23:34:12 -06:00
Hernan Martinez
204349b17b
Use architecture specific folders in the build script
2024-04-26 23:26:03 -06:00
Hernan Martinez
86e67fc4a9
Add import declaration for windows,arm64 to llm.go
2024-04-26 23:23:53 -06:00
Blake Mizerany
2bed62926e
types/model: remove Digest (for now) ( #3970 )
...
The Digest type needs more thought and is not necessary at the moment.
2024-04-26 21:14:28 -07:00
Jeffrey Morgan
aad8d128a0
also look at cwd as a root for windows runners ( #3959 )
2024-04-26 19:14:08 -04:00
Daniel Hiltgen
ec1acbb867
Merge pull request #3968 from dhiltgen/win_generate
...
Fine grain control over windows generate steps
2024-04-26 16:03:38 -07:00
Daniel Hiltgen
e4859c4563
Fine grain control over windows generate steps
...
This will speed up CI which already tries to only build static for unit tests
2024-04-26 15:49:46 -07:00
Nataly Merezhuk
8e30eb26bd
Updates the setup command to use llama3. ( #3962 )
2024-04-26 18:41:01 -04:00
Daniel Hiltgen
0b5c589ca2
Merge pull request #3966 from dhiltgen/bump
...
Fix target in gen_windows.ps1
2024-04-26 15:36:53 -07:00
Michael Yang
65fadddc85
Merge pull request #3964 from ollama/mxyng/weights
...
fix gemma, command-r layer weights
2024-04-26 15:23:33 -07:00
Daniel Hiltgen
ed5fb088c4
Fix target in gen_windows.ps1
2024-04-26 15:10:42 -07:00
Michael Yang
f81f308118
fix gemma, command-r layer weights
2024-04-26 15:00:55 -07:00
Blake Mizerany
b1390a7b37
types/model: export ParseNameBare and Merge ( #3957 )
...
These are useful outside this package.
2024-04-26 14:58:07 -07:00
Michael Yang
11d83386a5
Merge pull request #3951 from ollama/mxyng/zip
...
check file type before zip
2024-04-26 14:51:23 -07:00
Jeffrey Morgan
bb31def011
return code 499
when user cancels request while a model is loading ( #3955 )
2024-04-26 17:38:29 -04:00
Michael Yang
41e03ede95
check file type before zip
2024-04-26 14:18:07 -07:00
Michael Yang
7fea1ecdf6
Merge pull request #3958 from ollama/mxyng/fix-workflow
...
use merge base for diff-tree
2024-04-26 14:17:56 -07:00
Blake Mizerany
054894271d
.github/workflows/test.yaml: add in-flight cancellations on new push ( #3956 )
...
Also, remove a superfluous 'go get'
2024-04-26 13:54:24 -07:00
Michael Yang
6fef042f0b
use merge base for diff-tree
2024-04-26 13:54:15 -07:00
Daniel Hiltgen
5c0c2d1d09
Merge pull request #3954 from dhiltgen/ci_fixes
...
Put back non-avx CPU build for windows
2024-04-26 13:09:03 -07:00
Blake Mizerany
37f9c8ad99
types/model: overhaul Name and Digest types ( #3924 )
2024-04-26 13:08:32 -07:00
Quinten van Buul
2a80f55e2a
Update windows.md ( #3855 )
...
Fixed a typo
2024-04-26 16:04:15 -04:00
Daniel Hiltgen
421c878a2d
Put back non-avx CPU build for windows
2024-04-26 12:44:07 -07:00
Daniel Hiltgen
36666c2142
Merge pull request #3925 from dhiltgen/bump
...
Bump llama.cpp to b2737
2024-04-26 10:09:38 -07:00
Daniel Hiltgen
85801317d1
Fix clip log import
2024-04-26 09:43:46 -07:00
Daniel Hiltgen
2ed0d65948
Bump llama.cpp to b2737
2024-04-26 09:43:28 -07:00
Daniel Hiltgen
d459dc4ad1
Merge pull request #3950 from dhiltgen/windows_packaging
...
Fix exe name for zip packaging on windows
2024-04-26 09:27:37 -07:00
Daniel Hiltgen
40bc4622ef
Fix exe name for zip packaging on windows
...
The zip file encodes the OS and architecture, so keep the short exe name
2024-04-26 09:18:05 -07:00
Daniel Hiltgen
c0f818a07a
Merge pull request #3948 from dhiltgen/win_generate
...
Refactor windows generate for more modular usage
2024-04-26 09:17:20 -07:00
Daniel Hiltgen
8671fdeda6
Refactor windows generate for more modular usage
2024-04-26 08:35:50 -07:00
Daniel Hiltgen
2619850fb4
Merge pull request #3933 from dhiltgen/ci_fixes
...
Move cuda/rocm dependency gathering into generate script
2024-04-26 07:01:24 -07:00
Daniel Hiltgen
8feb97dc0d
Move cuda/rocm dependency gathering into generate script
...
This will make it simpler for CI to accumulate artifacts from prior steps
2024-04-25 22:38:44 -07:00
Daniel Hiltgen
4e1ff6dcbb
Merge pull request #3926 from dhiltgen/ci_fixes
...
Fix release CI
2024-04-25 17:42:31 -07:00
Daniel Hiltgen
8589d752ac
Fix release CI
...
download-artifact path was being used incorrectly. It is where to
extract the zip not the files in the zip to extract. Default is
workspace dir which is what we want, so omit it
2024-04-25 17:27:11 -07:00
Michael Yang
de4ded68b0
Merge pull request #3923 from ollama/mxyng/mem
...
only count output tensors
2024-04-25 16:34:17 -07:00
Daniel Hiltgen
9b5a3c5991
Merge pull request #3914 from dhiltgen/mac_perf
...
Improve mac parallel performance
2024-04-25 16:28:31 -07:00
Jeffrey Morgan
00b0699c75
Reload model if num_gpu
changes ( #3920 )
...
* reload model if `num_gpu` changes
* dont reload on -1
* fix tests
2024-04-25 19:02:40 -04:00
Jeffrey Morgan
993cf8bf55
llm: limit generation to 10x context size to avoid run on generations ( #3918 )
...
* llm: limit generation to 10x context size to avoid run on generations
* add comment
* simplify condition statement
2024-04-25 19:02:30 -04:00
Michael Yang
7bb7cb8a60
only count output tensors
2024-04-25 15:24:08 -07:00
Daniel Hiltgen
b123be5b71
Adjust context size for parallelism
2024-04-25 13:58:54 -07:00
jmorganca
ddf5c09a9b
use matrix multiplcation kernels in more cases
2024-04-25 13:58:54 -07:00
Roy Yang
5f73c08729
Remove trailing spaces ( #3889 )
2024-04-25 14:32:26 -04:00
Daniel Hiltgen
f503a848c2
Merge pull request #3895 from brycereitano/shiftloading
...
Move ggml loading to when attempting to fit
2024-04-25 09:24:08 -07:00
Bryce Reitano
36a6daccab
Restructure loading conditional chain
2024-04-24 17:37:03 -06:00
Bryce Reitano
ceb0e26e5e
Provide variable ggml for TestLoad
2024-04-24 17:19:55 -06:00
Bryce Reitano
284e02bed0
Move ggml loading to when we attempt fitting
2024-04-24 17:17:24 -06:00