Jeffrey Morgan
3b4bab3dc5
Fix embeddings load model behavior ( #2848 )
2024-02-29 17:40:56 -08:00
Michael Yang
0e19476b56
prepend image tags ( #2789 )
...
instead of appending image tags, prepend them - this generally produces better results
2024-02-29 11:30:14 -08:00
Jeffrey Morgan
287ba11500
better error message when calling /api/generate
or /api/chat
with embedding models
2024-02-20 21:53:45 -05:00
Jeffrey Morgan
63861f58cc
Support for bert
and nomic-bert
embedding models
2024-02-20 21:37:29 -05:00
Bruce MacDonald
88622847c6
fix: chat system prompting overrides ( #2542 )
2024-02-16 14:42:43 -05:00
Michael Yang
e43648afe5
rerefactor
2024-02-15 05:56:45 +00:00
Daniel Hiltgen
f397e0e988
Move hub auth out to new package
2024-02-15 05:56:45 +00:00
Jeffrey Morgan
48a273f80b
Fix issues with templating prompt in chat mode ( #2460 )
2024-02-12 15:06:57 -08:00
Jeffrey Morgan
1f9078d6ae
Check image filetype in api handlers ( #2467 )
2024-02-12 11:16:20 -08:00
Jeffrey Morgan
a0a199b108
Fix hanging issue when sending empty content ( #2399 )
2024-02-07 19:30:33 -05:00
Jeffrey Morgan
453f572f83
Initial OpenAI /v1/chat/completions
API compatibility ( #2376 )
2024-02-07 17:24:29 -05:00
Michael Yang
bfbf2f7cf7
Merge pull request #2296 from ollama/mxyng/img-tags
...
append image tags to user content
2024-02-01 13:16:59 -08:00
Michael Yang
3d6f48507a
structured debug prompt
2024-02-01 11:56:28 -08:00
Michael Yang
d125510b4b
remove image tags
2024-02-01 11:32:51 -08:00
Michael Yang
fb56988014
account for image projection in token count
2024-02-01 09:50:48 -08:00
Michael Yang
d046bee790
use llm.ImageData for chat
2024-01-31 19:18:25 -08:00
Jeffrey Morgan
f11bf0740b
use llm.ImageData
2024-01-31 19:13:48 -08:00
Michael Yang
8450bf66e6
trim images
2024-01-31 19:13:47 -08:00
Michael Yang
b4e11be8ef
append image tags to user content
2024-01-31 19:13:10 -08:00
Michael Yang
8ac08a0eec
update slog handler options
...
- consistent format by using text handler for debug and non-debug
- truncate source file to just the file name
2024-01-31 15:15:00 -08:00
Bruce MacDonald
0632dff3f8
trim chat prompt based on llm context size ( #1963 )
2024-01-30 15:59:29 -05:00
Jeffrey Morgan
f2245c7c77
print prompt with OLLAMA_DEBUG=1
( #2245 )
2024-01-28 15:22:35 -08:00
Jeffrey Morgan
e4b9b72f2a
Do not repeat system prompt for chat templating ( #2241 )
2024-01-28 14:15:56 -08:00
Patrick Devine
b5cf31b460
add keep_alive to generate/chat/embedding api endpoints ( #2146 )
2024-01-26 14:28:02 -08:00
Patrick Devine
7c40a67841
Save and load sessions ( #2063 )
2024-01-25 12:12:36 -08:00
Michael Yang
aac9ab4db7
fix show handler
2024-01-18 15:36:50 -08:00
Michael Yang
745b5934fa
add model to ModelResponse
2024-01-18 14:32:55 -08:00
Michael Yang
a38d88d828
api: add model for all requests
...
prefer using req.Model and fallback to req.Name
2024-01-18 14:31:37 -08:00
Daniel Hiltgen
fedd705aea
Mechanical switch from log to slog
...
A few obvious levels were adjusted, but generally everything mapped to "info" level.
2024-01-18 14:12:57 -08:00
Patrick Devine
eef50accb4
Fix show parameters ( #2017 )
2024-01-16 10:34:44 -08:00
Michael Yang
2b9892a808
fix(windows): modelpath and list
2024-01-09 09:36:58 -08:00
Michael Yang
2bb2bdd5d4
fix lint
2024-01-09 09:36:58 -08:00
Michael Yang
acfc376efd
add .golangci.yaml
2024-01-09 09:36:58 -08:00
Michael Yang
0101e76dbe
Merge pull request #1797 from sublimator/nd-allow-extension-origins-still-needs-explicit-listing-2024-01-05
...
fix: allow extension origins (still needs explicit listing), fixes #1686
2024-01-05 17:20:09 -08:00
Patrick Devine
22e93efa41
add show info command and fix the modelfile
2024-01-05 12:20:05 -08:00
Nicholas Dudfield
8baaaa39c0
Allow extension origins (still needs explicit listing), fixes #1686
2024-01-05 09:06:47 +07:00
Bruce MacDonald
0b3118e0af
fix: relay request opts to loaded llm prediction ( #1761 )
2024-01-03 12:01:42 -05:00
Bruce MacDonald
db356c8519
post-response templating ( #1427 )
2023-12-22 17:07:05 -05:00
Daniel Hiltgen
35934b2e05
Adapted rocm support to cgo based llama.cpp
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
d4cd695759
Add cgo implementation for llama.cpp
...
Run the server.cpp directly inside the Go runtime via cgo
while retaining the LLM Go abstractions.
2023-12-19 09:05:46 -08:00
Bruce MacDonald
811b1f03c8
deprecate ggml
...
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf in the case automatic pull fails
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-12-19 09:05:46 -08:00
Bruce MacDonald
d99fa6ce0a
send empty messages on last chat response ( #1530 )
2023-12-18 14:23:38 -05:00
Patrick Devine
630518f0d9
Add unit test of API routes ( #1528 )
2023-12-14 16:47:40 -08:00
Bruce MacDonald
6ee8c80199
restore model load duration on generate response ( #1524 )
...
* restore model load duration on generate response
- set model load duration on generate and chat done response
- calculate createAt time when response created
* remove checkpoints predict opts
* Update routes.go
2023-12-14 12:15:50 -05:00
Patrick Devine
d9e60f634b
add image support to the chat api ( #1490 )
2023-12-12 13:28:58 -08:00
Patrick Devine
910e9401d0
Multimodal support ( #1216 )
...
---------
Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>
2023-12-11 13:56:22 -08:00
Jeffrey Morgan
7db5bcf73b
fix go-staticcheck
warning
2023-12-10 11:44:27 -05:00
Jeffrey Morgan
fa2f095bd9
fix model name returned by /api/generate
being different than the model name provided
2023-12-10 11:42:15 -05:00
Jeffrey Morgan
045b855db9
fix error on accumulating final chat response
2023-12-10 11:24:39 -05:00
Jeffrey Morgan
32064a0646
fix empty response when receiving runner error
2023-12-10 10:53:38 -05:00
Jeffrey Morgan
9e1406e4ed
Don't expose model information in /api/generate
2023-12-09 02:05:43 -08:00
Bruce MacDonald
7e9405fd07
fix: encode full previous prompt in context ( #1424 )
2023-12-08 16:53:51 -05:00
Michael Yang
c3ff36088b
Merge pull request #774 from jmorganca/mxyng/server-version
...
add version api and show server version in cli
2023-12-06 13:22:55 -08:00
Michael Yang
5d75505ebd
return model configuration in generate
2023-12-05 14:39:02 -08:00
Michael Yang
b9495ea162
load projectors
2023-12-05 14:36:12 -08:00
Michael Yang
d3479c07a1
Merge pull request #1250 from jmorganca/mxyng/create-layer
...
refactor layer creation
2023-12-05 14:32:52 -08:00
Bruce MacDonald
195e3d9dbd
chat api endpoint ( #1392 )
2023-12-05 14:57:33 -05:00
Michael Yang
1ebdbd9694
server: add version handler
2023-12-05 09:36:01 -08:00
Jeffrey Morgan
00d06619a1
Revert "chat api ( #991 )" while context variable is fixed
...
This reverts commit 7a0899d62d
.
2023-12-04 21:16:27 -08:00
Michael Yang
a3737cbd33
use NewLayer for CreateBlobHandler
2023-12-04 16:59:23 -08:00
Bruce MacDonald
7a0899d62d
chat api ( #991 )
...
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
2023-12-04 18:01:06 -05:00
Bruce MacDonald
96122b7271
validate model tags on copy ( #1323 )
2023-11-29 15:54:29 -05:00
Timothy Jaeryang Baek
c2e3b89176
fix: disable ':' in tag names ( #1280 )
...
Co-authored-by: rootedbox
2023-11-29 13:33:45 -05:00
Bruce MacDonald
37d95157df
fix relative path on create ( #1222 )
2023-11-21 15:43:17 -05:00
Bruce MacDonald
43a726149d
fix potentially inaccurate error message
2023-11-18 21:25:07 -05:00
Jeffrey Morgan
bab9494176
add -
separator to temp file created on ollama create
2023-11-18 09:39:52 -05:00
Michael Yang
c6e6c8ee7e
fix cross device rename
2023-11-17 15:22:17 -08:00
Michael Yang
54f92f01cb
update docs
2023-11-15 15:28:15 -08:00
Michael Yang
bc22d5a38b
no blob response
2023-11-15 15:16:23 -08:00
Michael Yang
1901044b07
use checksum reference
2023-11-15 15:16:23 -08:00
Michael Yang
1552cee59f
client create modelfile
2023-11-15 15:16:23 -08:00
Michael Yang
3ca56b5ada
add create modelfile field
2023-11-15 15:16:23 -08:00
Michael Yang
b0d14ed51c
refactor create model
2023-11-15 15:16:23 -08:00
Jeffrey Morgan
5cba29b9d6
JSON mode: add `"format" as an api parameter ( #1051 )
...
* add `"format": "json"` as an API parameter
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-11-09 16:44:02 -08:00
Bruce MacDonald
ec2a31e9b3
support raw generation requests ( #952 )
...
- add the optional `raw` generate request parameter to bypass prompt formatting and response context
-add raw request to docs
2023-11-08 14:05:02 -08:00
Noah Gitsham
8ae8c9fa8c
Remove duplicate "install" in GPU support warning ( #984 )
2023-11-03 00:45:14 -07:00
Noah Gitsham
f39daff461
Add missing "be" to GPU support warning message ( #983 )
2023-11-02 18:37:12 -07:00
Michael Yang
2c6189f4fe
Merge pull request #750 from jmorganca/mxyng/concurrent-uploads
...
concurrent uploads
2023-11-01 15:00:01 -07:00
Bruce MacDonald
f9a4281124
clean up: remove server functions from client ( #937 )
2023-10-30 11:10:18 -04:00
Michael Yang
4e09aab8b9
concurrent uploads
2023-10-27 17:07:33 -07:00
Michael Yang
386169205c
update runtime options ( #864 )
2023-10-20 21:17:14 -04:00
Jeffrey Morgan
7ed5a39bc7
simpler check for model loading compatibility errors
2023-10-19 14:50:49 -04:00
Michael Yang
e1c5be24e7
check json eof
2023-10-19 09:21:51 -07:00
Michael Yang
2ad8a074ac
generate: set created_at
...
move the empty response so it's more visible
2023-10-19 09:21:51 -07:00
Michael Yang
7e547c6833
s/message/error/
2023-10-19 09:21:04 -07:00
Michael Yang
689842b9ff
request: bad request when model missing fields
2023-10-19 09:21:04 -07:00
Michael Yang
a19d47642e
models: rm workDir from CreateModel
...
unused after removing EMBED
2023-10-19 09:21:04 -07:00
Bruce MacDonald
fe6f3b48f7
do not reload the running llm when runtime params change ( #840 )
...
- only reload the running llm if the model has changed, or the options for loading the running model have changed
- rename loaded llm to runner to differentiate from loaded model image
- remove logic which keeps the first system prompt in the generation context
2023-10-19 10:39:58 -04:00
Yiorgis Gozadinos
8c6c2cbc8c
When the .ollama folder is broken or there are no models return an empty list on /api/tags
2023-10-18 08:23:20 +02:00
Michael Yang
1af493c5a0
server: print version on start
2023-10-16 09:59:14 -07:00
Bruce MacDonald
a0c3e989de
deprecate modelfile embed command ( #759 )
2023-10-16 11:07:37 -04:00
Bruce MacDonald
7804b8fab9
validate api options fields from map ( #711 )
2023-10-12 11:18:11 -04:00
Bruce MacDonald
274d5a5fdf
optional parameter to not stream response ( #639 )
...
* update streaming request accept header
* add optional stream param to request bodies
2023-10-11 12:54:27 -04:00
Bruce MacDonald
af4cf55884
not found error before pulling model ( #718 )
2023-10-06 16:06:20 -04:00
Jay Nakrani
1d0ebe67e8
Document response stream chunk delimiter. ( #632 )
...
Document response stream chunk delimiter.
2023-09-29 21:45:52 -07:00
Bruce MacDonald
a1b2d95f96
remove unused push/pull params ( #650 )
2023-09-29 17:27:19 -04:00
Michael Yang
8608eb4760
prune empty directories
2023-09-27 10:58:09 -07:00
Jeffrey Morgan
9b12a511ca
check other request fields before load short circuit in /api/generate
2023-09-22 23:50:55 -04:00
Bruce MacDonald
5d71bda478
close llm on interrupt ( #577 )
2023-09-22 19:41:52 +01:00
Michael Yang
82f5b66c01
register HEAD /api/tags
2023-09-21 16:38:03 -07:00