Commit graph

249 commits

Author SHA1 Message Date
Jeffrey Morgan
3b4bab3dc5
Fix embeddings load model behavior (#2848) 2024-02-29 17:40:56 -08:00
Michael Yang
0e19476b56
prepend image tags (#2789)
instead of appending image tags, prepend them - this generally produces better results
2024-02-29 11:30:14 -08:00
Jeffrey Morgan
287ba11500 better error message when calling /api/generate or /api/chat with embedding models 2024-02-20 21:53:45 -05:00
Jeffrey Morgan
63861f58cc
Support for bert and nomic-bert embedding models 2024-02-20 21:37:29 -05:00
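
With bert and nomic-bert architectures supported, embeddings can be requested through the existing /api/embeddings endpoint. A minimal Go sketch, assuming a local server on the default port 11434 and a model name such as `nomic-embed-text` (an assumption, not taken from this log):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Request body for /api/embeddings; the model name is an assumption and
	// must already be pulled locally (e.g. `ollama pull nomic-embed-text`).
	body, _ := json.Marshal(map[string]any{
		"model":  "nomic-embed-text",
		"prompt": "The sky is blue because of Rayleigh scattering",
	})

	resp, err := http.Post("http://localhost:11434/api/embeddings", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out struct {
		Embedding []float64 `json:"embedding"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println("embedding length:", len(out.Embedding))
}
```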
Bruce MacDonald
88622847c6
fix: chat system prompting overrides (#2542) 2024-02-16 14:42:43 -05:00
Michael Yang
e43648afe5 rerefactor 2024-02-15 05:56:45 +00:00
Daniel Hiltgen
f397e0e988 Move hub auth out to new package 2024-02-15 05:56:45 +00:00
Jeffrey Morgan
48a273f80b
Fix issues with templating prompt in chat mode (#2460) 2024-02-12 15:06:57 -08:00
Jeffrey Morgan
1f9078d6ae
Check image filetype in api handlers (#2467) 2024-02-12 11:16:20 -08:00
Jeffrey Morgan
a0a199b108
Fix hanging issue when sending empty content (#2399) 2024-02-07 19:30:33 -05:00
Jeffrey Morgan
453f572f83
Initial OpenAI /v1/chat/completions API compatibility (#2376) 2024-02-07 17:24:29 -05:00
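
The OpenAI-compatible endpoint mirrors the /v1/chat/completions request shape, so existing OpenAI clients can point at the local server. A minimal sketch, assuming a local server and a model named `llama2` already pulled (both assumptions):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// OpenAI-style chat completion request; the model name is an assumption.
	body, _ := json.Marshal(map[string]any{
		"model": "llama2",
		"messages": []map[string]string{
			{"role": "user", "content": "Say hello in one word."},
		},
		"stream": false,
	})

	resp, err := http.Post("http://localhost:11434/v1/chat/completions", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	raw, _ := io.ReadAll(resp.Body)
	fmt.Println(string(raw)) // choices[0].message.content holds the reply
}
```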
Michael Yang
bfbf2f7cf7
Merge pull request #2296 from ollama/mxyng/img-tags
append image tags to user content
2024-02-01 13:16:59 -08:00
Michael Yang
3d6f48507a structured debug prompt 2024-02-01 11:56:28 -08:00
Michael Yang
d125510b4b remove image tags 2024-02-01 11:32:51 -08:00
Michael Yang
fb56988014 account for image projection in token count 2024-02-01 09:50:48 -08:00
Michael Yang
d046bee790 use llm.ImageData for chat 2024-01-31 19:18:25 -08:00
Jeffrey Morgan
f11bf0740b use llm.ImageData 2024-01-31 19:13:48 -08:00
Michael Yang
8450bf66e6 trim images 2024-01-31 19:13:47 -08:00
Michael Yang
b4e11be8ef append image tags to user content 2024-01-31 19:13:10 -08:00
Michael Yang
8ac08a0eec update slog handler options
- consistent format by using text handler for debug and non-debug
- truncate source file to just the file name
2024-01-31 15:15:00 -08:00
Bruce MacDonald
0632dff3f8
trim chat prompt based on llm context size (#1963) 2024-01-30 15:59:29 -05:00
Jeffrey Morgan
f2245c7c77
print prompt with OLLAMA_DEBUG=1 (#2245) 2024-01-28 15:22:35 -08:00
Jeffrey Morgan
e4b9b72f2a
Do not repeat system prompt for chat templating (#2241) 2024-01-28 14:15:56 -08:00
Patrick Devine
b5cf31b460
add keep_alive to generate/chat/embedding api endpoints (#2146) 2024-01-26 14:28:02 -08:00
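
keep_alive controls how long the model stays loaded in memory after a request; it accepts a duration string or a number of seconds. A sketch against /api/generate, assuming a local server and the `llama2` model (an assumption):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// keep_alive accepts a duration string ("10m") or a number of seconds;
	// 0 unloads the model right after the response.
	body, _ := json.Marshal(map[string]any{
		"model":      "llama2", // assumption: any locally pulled model
		"prompt":     "Why is the sky blue?",
		"stream":     false,
		"keep_alive": "10m",
	})

	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	raw, _ := io.ReadAll(resp.Body)
	fmt.Println(string(raw))
}
```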
Patrick Devine
7c40a67841
Save and load sessions (#2063) 2024-01-25 12:12:36 -08:00
Michael Yang
aac9ab4db7 fix show handler 2024-01-18 15:36:50 -08:00
Michael Yang
745b5934fa add model to ModelResponse 2024-01-18 14:32:55 -08:00
Michael Yang
a38d88d828 api: add model for all requests
prefer using req.Model and fallback to req.Name
2024-01-18 14:31:37 -08:00
Daniel Hiltgen
fedd705aea Mechanical switch from log to slog
A few obvious levels were adjusted, but generally everything mapped to "info" level.
2024-01-18 14:12:57 -08:00
Patrick Devine
eef50accb4
Fix show parameters (#2017) 2024-01-16 10:34:44 -08:00
Michael Yang
2b9892a808 fix(windows): modelpath and list 2024-01-09 09:36:58 -08:00
Michael Yang
2bb2bdd5d4 fix lint 2024-01-09 09:36:58 -08:00
Michael Yang
acfc376efd add .golangci.yaml 2024-01-09 09:36:58 -08:00
Michael Yang
0101e76dbe
Merge pull request #1797 from sublimator/nd-allow-extension-origins-still-needs-explicit-listing-2024-01-05
fix: allow extension origins (still needs explicit listing), fixes #1686
2024-01-05 17:20:09 -08:00
Patrick Devine
22e93efa41 add show info command and fix the modelfile 2024-01-05 12:20:05 -08:00
Nicholas Dudfield
8baaaa39c0 Allow extension origins (still needs explicit listing), fixes #1686 2024-01-05 09:06:47 +07:00
Bruce MacDonald
0b3118e0af
fix: relay request opts to loaded llm prediction (#1761) 2024-01-03 12:01:42 -05:00
Bruce MacDonald
db356c8519
post-response templating (#1427) 2023-12-22 17:07:05 -05:00
Daniel Hiltgen
35934b2e05 Adapted rocm support to cgo based llama.cpp 2023-12-19 09:05:46 -08:00
Daniel Hiltgen
d4cd695759 Add cgo implementation for llama.cpp
Run the server.cpp directly inside the Go runtime via cgo
while retaining the LLM Go abstractions.
2023-12-19 09:05:46 -08:00
Bruce MacDonald
811b1f03c8 deprecate ggml
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf in case the automatic pull fails

Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-12-19 09:05:46 -08:00
Bruce MacDonald
d99fa6ce0a
send empty messages on last chat response (#1530) 2023-12-18 14:23:38 -05:00
Patrick Devine
630518f0d9
Add unit test of API routes (#1528) 2023-12-14 16:47:40 -08:00
Bruce MacDonald
6ee8c80199
restore model load duration on generate response (#1524)
* restore model load duration on generate response

- set model load duration on generate and chat done response
- calculate createAt time when response created

* remove checkpoints predict opts

* Update routes.go
2023-12-14 12:15:50 -05:00
Patrick Devine
d9e60f634b
add image support to the chat api (#1490) 2023-12-12 13:28:58 -08:00
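
Images are passed per message as base64-encoded data alongside the text content. A sketch of a multimodal /api/chat call, assuming a local server, a vision model such as `llava`, and a local image file (all assumptions):

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	// Read an image and base64-encode it; each chat message carries its own
	// "images" list.
	img, err := os.ReadFile("photo.jpg") // assumption: any local JPEG or PNG
	if err != nil {
		panic(err)
	}

	body, _ := json.Marshal(map[string]any{
		"model":  "llava", // assumption: a multimodal model pulled locally
		"stream": false,
		"messages": []map[string]any{
			{
				"role":    "user",
				"content": "What is in this picture?",
				"images":  []string{base64.StdEncoding.EncodeToString(img)},
			},
		},
	})

	resp, err := http.Post("http://localhost:11434/api/chat", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	raw, _ := io.ReadAll(resp.Body)
	fmt.Println(string(raw))
}
```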
Patrick Devine
910e9401d0
Multimodal support (#1216)
---------

Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>
2023-12-11 13:56:22 -08:00
Jeffrey Morgan
7db5bcf73b fix go-staticcheck warning 2023-12-10 11:44:27 -05:00
Jeffrey Morgan
fa2f095bd9 fix model name returned by /api/generate being different than the model name provided 2023-12-10 11:42:15 -05:00
Jeffrey Morgan
045b855db9 fix error on accumulating final chat response 2023-12-10 11:24:39 -05:00
Jeffrey Morgan
32064a0646 fix empty response when receiving runner error 2023-12-10 10:53:38 -05:00
Jeffrey Morgan
9e1406e4ed Don't expose model information in /api/generate 2023-12-09 02:05:43 -08:00
Bruce MacDonald
7e9405fd07
fix: encode full previous prompt in context (#1424) 2023-12-08 16:53:51 -05:00
Michael Yang
c3ff36088b
Merge pull request #774 from jmorganca/mxyng/server-version
add version api and show server version in cli
2023-12-06 13:22:55 -08:00
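
The version handler is a plain GET that reports the running server's version. A minimal sketch, assuming the default local address:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	resp, err := http.Get("http://localhost:11434/api/version")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out struct {
		Version string `json:"version"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println("server version:", out.Version)
}
```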
Michael Yang
5d75505ebd return model configuration in generate 2023-12-05 14:39:02 -08:00
Michael Yang
b9495ea162 load projectors 2023-12-05 14:36:12 -08:00
Michael Yang
d3479c07a1
Merge pull request #1250 from jmorganca/mxyng/create-layer
refactor layer creation
2023-12-05 14:32:52 -08:00
Bruce MacDonald
195e3d9dbd
chat api endpoint (#1392) 2023-12-05 14:57:33 -05:00
Michael Yang
1ebdbd9694 server: add version handler 2023-12-05 09:36:01 -08:00
Jeffrey Morgan
00d06619a1 Revert "chat api (#991)" while context variable is fixed
This reverts commit 7a0899d62d.
2023-12-04 21:16:27 -08:00
Michael Yang
a3737cbd33 use NewLayer for CreateBlobHandler 2023-12-04 16:59:23 -08:00
Bruce MacDonald
7a0899d62d
chat api (#991)
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
2023-12-04 18:01:06 -05:00
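
The messages-based chat endpoint introduced here (reverted below and re-landed as #1392 above) takes the full conversation on each call; the returned message can be appended to the history and resent to continue the exchange. A sketch, assuming a local server and the `llama2` model (an assumption):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Non-streaming chat request: the whole message history is sent each time.
	body, _ := json.Marshal(map[string]any{
		"model":  "llama2", // assumption: any locally pulled chat model
		"stream": false,
		"messages": []map[string]string{
			{"role": "system", "content": "You are a terse assistant."},
			{"role": "user", "content": "Name one prime number."},
		},
	})

	resp, err := http.Post("http://localhost:11434/api/chat", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out struct {
		Message struct {
			Role    string `json:"role"`
			Content string `json:"content"`
		} `json:"message"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println(out.Message.Role+":", out.Message.Content)
}
```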
Bruce MacDonald
96122b7271
validate model tags on copy (#1323) 2023-11-29 15:54:29 -05:00
Timothy Jaeryang Baek
c2e3b89176
fix: disable ':' in tag names (#1280)
Co-authored-by: rootedbox
2023-11-29 13:33:45 -05:00
Bruce MacDonald
37d95157df
fix relative path on create (#1222) 2023-11-21 15:43:17 -05:00
Bruce MacDonald
43a726149d fix potentially inaccurate error message 2023-11-18 21:25:07 -05:00
Jeffrey Morgan
bab9494176 add - separator to temp file created on ollama create 2023-11-18 09:39:52 -05:00
Michael Yang
c6e6c8ee7e fix cross device rename 2023-11-17 15:22:17 -08:00
Michael Yang
54f92f01cb update docs 2023-11-15 15:28:15 -08:00
Michael Yang
bc22d5a38b no blob response 2023-11-15 15:16:23 -08:00
Michael Yang
1901044b07 use checksum reference 2023-11-15 15:16:23 -08:00
Michael Yang
1552cee59f client create modelfile 2023-11-15 15:16:23 -08:00
Michael Yang
3ca56b5ada add create modelfile field 2023-11-15 15:16:23 -08:00
Michael Yang
b0d14ed51c refactor create model 2023-11-15 15:16:23 -08:00
Jeffrey Morgan
5cba29b9d6
JSON mode: add `"format": "json"` as an api parameter (#1051)
* add `"format": "json"` as an API parameter
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-11-09 16:44:02 -08:00
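
Setting "format": "json" constrains the model's output to valid JSON; instructing the model in the prompt to answer in JSON as well tends to help. A sketch against /api/generate, assuming a local server and the `llama2` model (an assumption):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// "format": "json" asks the server to constrain output to valid JSON.
	body, _ := json.Marshal(map[string]any{
		"model":  "llama2", // assumption: any locally pulled model
		"prompt": "List three colors as a JSON array under the key \"colors\".",
		"format": "json",
		"stream": false,
	})

	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out struct {
		Response string `json:"response"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println(out.Response) // should itself be parseable JSON
}
```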
Bruce MacDonald
ec2a31e9b3
support raw generation requests (#952)
- add the optional `raw` generate request parameter to bypass prompt formatting and response context
- add raw request to docs
2023-11-08 14:05:02 -08:00
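
With `raw` set, the prompt is sent to the model verbatim, bypassing the Modelfile template, and no context is returned. A sketch, assuming a local server and a model whose prompt format the caller handles manually (model name and prompt format are assumptions):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// raw: true skips the server-side prompt template, so the caller is
	// responsible for any special formatting the model expects.
	body, _ := json.Marshal(map[string]any{
		"model":  "llama2", // assumption: any locally pulled model
		"prompt": "[INST] Why is the sky blue? [/INST]",
		"raw":    true,
		"stream": false,
	})

	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	raw, _ := io.ReadAll(resp.Body)
	fmt.Println(string(raw))
}
```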
Noah Gitsham
8ae8c9fa8c
Remove duplicate "install" in GPU support warning (#984) 2023-11-03 00:45:14 -07:00
Noah Gitsham
f39daff461
Add missing "be" to GPU support warning message (#983) 2023-11-02 18:37:12 -07:00
Michael Yang
2c6189f4fe
Merge pull request #750 from jmorganca/mxyng/concurrent-uploads
concurrent uploads
2023-11-01 15:00:01 -07:00
Bruce MacDonald
f9a4281124
clean up: remove server functions from client (#937) 2023-10-30 11:10:18 -04:00
Michael Yang
4e09aab8b9 concurrent uploads 2023-10-27 17:07:33 -07:00
Michael Yang
386169205c
update runtime options (#864) 2023-10-20 21:17:14 -04:00
Jeffrey Morgan
7ed5a39bc7 simpler check for model loading compatibility errors 2023-10-19 14:50:49 -04:00
Michael Yang
e1c5be24e7 check json eof 2023-10-19 09:21:51 -07:00
Michael Yang
2ad8a074ac generate: set created_at
move the empty response so it's more visible
2023-10-19 09:21:51 -07:00
Michael Yang
7e547c6833 s/message/error/ 2023-10-19 09:21:04 -07:00
Michael Yang
689842b9ff request: bad request when model missing fields 2023-10-19 09:21:04 -07:00
Michael Yang
a19d47642e models: rm workDir from CreateModel
unused after removing EMBED
2023-10-19 09:21:04 -07:00
Bruce MacDonald
fe6f3b48f7
do not reload the running llm when runtime params change (#840)
- only reload the running llm if the model has changed, or the options for loading the running model have changed
- rename loaded llm to runner to differentiate from loaded model image
- remove logic which keeps the first system prompt in the generation context
2023-10-19 10:39:58 -04:00
Yiorgis Gozadinos
8c6c2cbc8c When the .ollama folder is broken or there are no models, return an empty list on /api/tags 2023-10-18 08:23:20 +02:00
Michael Yang
1af493c5a0 server: print version on start 2023-10-16 09:59:14 -07:00
Bruce MacDonald
a0c3e989de
deprecate modelfile embed command (#759) 2023-10-16 11:07:37 -04:00
Bruce MacDonald
7804b8fab9
validate api options fields from map (#711) 2023-10-12 11:18:11 -04:00
Bruce MacDonald
274d5a5fdf
optional parameter to not stream response (#639)
* update streaming request accept header
* add optional stream param to request bodies
2023-10-11 12:54:27 -04:00
Bruce MacDonald
af4cf55884
not found error before pulling model (#718) 2023-10-06 16:06:20 -04:00
Jay Nakrani
1d0ebe67e8
Document response stream chunk delimiter. (#632)
2023-09-29 21:45:52 -07:00
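
Streaming responses arrive as a sequence of JSON objects separated by newlines, so a line-oriented reader is enough to consume them. A Go sketch, assuming the default local server and the `llama2` model (an assumption):

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	body, _ := json.Marshal(map[string]any{
		"model":  "llama2", // assumption: any locally pulled model
		"prompt": "Why is the sky blue?",
	})

	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Each chunk is one JSON object on its own line; "done": true marks the end.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		var chunk struct {
			Response string `json:"response"`
			Done     bool   `json:"done"`
		}
		if err := json.Unmarshal(scanner.Bytes(), &chunk); err != nil {
			panic(err)
		}
		fmt.Print(chunk.Response)
		if chunk.Done {
			fmt.Println()
			break
		}
	}
	if err := scanner.Err(); err != nil {
		panic(err)
	}
}
```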
Bruce MacDonald
a1b2d95f96
remove unused push/pull params (#650) 2023-09-29 17:27:19 -04:00
Michael Yang
8608eb4760 prune empty directories 2023-09-27 10:58:09 -07:00
Jeffrey Morgan
9b12a511ca check other request fields before load short circuit in /api/generate 2023-09-22 23:50:55 -04:00
Bruce MacDonald
5d71bda478
close llm on interrupt (#577) 2023-09-22 19:41:52 +01:00
Michael Yang
82f5b66c01 register HEAD /api/tags 2023-09-21 16:38:03 -07:00