Michael Yang
b25976aeb8
routes: fix show llava models
2024-05-08 12:43:36 -07:00
Michael Yang
eeb695261f
skip if same quantization
2024-05-07 17:44:19 -07:00
Michael Yang
548a7df014
update list handler to use model.Name
2024-05-07 09:38:45 -07:00
Michael Yang
d245460362
only quantize language models
2024-05-06 15:24:01 -07:00
Michael Yang
4d0d0fa383
no iterator
2024-05-06 15:24:01 -07:00
Michael Yang
7ffe45734d
rebase
2024-05-06 15:24:01 -07:00
Michael Yang
01811c176a
comments
2024-05-06 15:24:01 -07:00
Michael Yang
9685c34509
quantize any fp16/fp32 model
...
- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}
2024-05-06 15:24:01 -07:00
Daniel Hiltgen
f56aa20014
Centralize server config handling
...
This moves all the env var reading into one central module
and logs the loaded config once at startup which should
help in troubleshooting user server logs
2024-05-05 16:49:50 -07:00
Michael Yang
119589fcb3
rename parser to model/file
2024-05-01 09:53:50 -07:00
Michael Yang
9cf0f2e973
use parser.Format instead of templating modelfile
2024-05-01 09:52:54 -07:00
Bruce MacDonald
0a7fdbe533
prompt to display and add local ollama keys to account ( #3717 )
...
- return descriptive error messages when unauthorized to create blob or push a model
- display the local public key associated with the request that was denied
2024-04-30 11:02:08 -07:00
Jeffrey Morgan
586672f490
fix copying model to itself ( #4019 )
2024-04-28 23:47:49 -04:00
Blake Mizerany
37f9c8ad99
types/model: overhaul Name and Digest types ( #3924 )
2024-04-26 13:08:32 -07:00
Michael Yang
592dae31c8
update copy to use model.Name
2024-04-24 15:54:54 -07:00
Cheng
62be2050dd
chore: use errors.New to replace fmt.Errorf will much better ( #3789 )
2024-04-20 22:11:06 -04:00
Patrick Devine
9f8691c6c8
Add llama2 / torch models for ollama create
( #3607 )
2024-04-15 11:26:42 -07:00
Michael Yang
9502e5661f
cgo quantize
2024-04-08 15:31:08 -07:00
Patrick Devine
3b6a9154dd
Simplify model conversion ( #3422 )
2024-04-01 16:14:53 -07:00
Michael Yang
d338d70492
refactor model parsing
2024-04-01 13:16:15 -07:00
Patrick Devine
5a5efee46b
Add gemma safetensors conversion ( #3250 )
...
Co-authored-by: Michael Yang <mxyng@pm.me>
2024-03-28 18:54:01 -07:00
Patrick Devine
1b272d5bcd
change github.com/jmorganca/ollama
to github.com/ollama/ollama
( #3347 )
2024-03-26 13:04:17 -07:00
Blake Mizerany
703684a82a
server: replace blob prefix separator from ':' to '-' ( #3146 )
...
This fixes issues with blob file names that contain ':' characters to be rejected by file systems that do not support them.
2024-03-14 20:18:06 -07:00
Michael Yang
76bdebbadf
decode ggla
2024-03-08 15:46:25 -08:00
Bruce MacDonald
0cebc79cba
fix: allow importing a model from name reference ( #3005 )
2024-03-08 12:27:47 -05:00
Patrick Devine
2c017ca441
Convert Safetensors to an Ollama model ( #2824 )
2024-03-06 21:01:51 -08:00
Jeffrey Morgan
63861f58cc
Support for bert
and nomic-bert
embedding models
2024-02-20 21:37:29 -05:00
Michael Yang
897b213468
use http.DefaultClient ( #2530 )
...
default client already handles proxy
2024-02-20 18:34:47 -05:00
Michael Yang
e43648afe5
rerefactor
2024-02-15 05:56:45 +00:00
Daniel Hiltgen
f397e0e988
Move hub auth out to new package
2024-02-15 05:56:45 +00:00
Jeffrey Morgan
48a273f80b
Fix issues with templating prompt in chat mode ( #2460 )
2024-02-12 15:06:57 -08:00
Jeffrey Morgan
a0a199b108
Fix hanging issue when sending empty content ( #2399 )
2024-02-07 19:30:33 -05:00
Michael Yang
f3761405c8
use image id
2024-02-01 11:52:42 -08:00
Michael Yang
d125510b4b
remove image tags
2024-02-01 11:32:51 -08:00
Michael Yang
d046bee790
use llm.ImageData for chat
2024-01-31 19:18:25 -08:00
Michael Yang
8450bf66e6
trim images
2024-01-31 19:13:47 -08:00
Michael Yang
b4e11be8ef
append image tags to user content
2024-01-31 19:13:10 -08:00
Bruce MacDonald
a896079705
preserve last system message from modelfile ( #2289 )
2024-01-31 21:45:01 -05:00
Bruce MacDonald
0632dff3f8
trim chat prompt based on llm context size ( #1963 )
2024-01-30 15:59:29 -05:00
Jeffrey Morgan
e4b9b72f2a
Do not repeat system prompt for chat templating ( #2241 )
2024-01-28 14:15:56 -08:00
Patrick Devine
7c40a67841
Save and load sessions ( #2063 )
2024-01-25 12:12:36 -08:00
Michael Yang
c08dfaa23d
fix: remove overwritten model layers
...
if create overrides a manifest, first add the older manifest's layers to
the delete map so they can be cleaned up
2024-01-19 14:58:37 -08:00
Daniel Hiltgen
fedd705aea
Mechanical switch from log to slog
...
A few obvious levels were adjusted, but generally everything mapped to "info" level.
2024-01-18 14:12:57 -08:00
Michael Yang
cf29bd2d72
fix: request retry with error
...
this fixes a subtle bug with makeRequestWithRetry where an HTTP status
error on a retried request will potentially not return the right err
2024-01-12 13:32:27 -08:00
Michael Yang
2bb2bdd5d4
fix lint
2024-01-09 09:36:58 -08:00
Bruce MacDonald
7e8f7c8358
remove ggml automatic re-pull ( #1856 )
2024-01-08 14:41:01 -05:00
Bruce MacDonald
4ad6c9b11f
fix: pull either original model or from model on create ( #1774 )
2024-01-04 01:34:38 -05:00
Bruce MacDonald
db356c8519
post-response templating ( #1427 )
2023-12-22 17:07:05 -05:00
Bruce MacDonald
5e7fd6906f
Update images.go
2023-12-19 09:05:46 -08:00
Bruce MacDonald
811b1f03c8
deprecate ggml
...
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf in the case automatic pull fails
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-12-19 09:05:46 -08:00