Jackie Li
af47413dba
Add MarshalJSON to Duration ( #3284 )
...
---------
Co-authored-by: Patrick Devine <patrick@infrahq.com>
2024-05-06 15:59:18 -07:00
Patrick Devine
9009bedf13
better checking for OLLAMA_HOST variable ( #3661 )
2024-04-29 19:14:07 -04:00
Jeffrey Morgan
993cf8bf55
llm: limit generation to 10x context size to avoid run on generations ( #3918 )
...
* llm: limit generation to 10x context size to avoid run on generations
* add comment
* simplify condition statement
2024-04-25 19:02:30 -04:00
Daniel Hiltgen
34b9db5afc
Request and model concurrency
...
This change adds support for multiple concurrent requests, as well as
loading multiple models by spawning multiple runners. The default
settings are currently set at 1 concurrent request per model and only 1
loaded model at a time, but these can be adjusted by setting
OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.
2024-04-22 19:29:12 -07:00
Cheng
62be2050dd
chore: use errors.New to replace fmt.Errorf will much better ( #3789 )
2024-04-20 22:11:06 -04:00
Eli Bendersky
ad90b9ab3d
api: start adding documentation to package api ( #2878 )
...
* api: start adding documentation to package api
Updates #2840
* Fix lint typo report
2024-04-10 13:31:55 -04:00
Michael Yang
01114b4526
fix: rope
2024-04-09 16:15:24 -07:00
Michael Yang
9502e5661f
cgo quantize
2024-04-08 15:31:08 -07:00
Michael Yang
e1c9a2a00f
no blob create if already exists
2024-04-08 15:09:48 -07:00
Michael Yang
be517e491c
no rope parameters
2024-04-05 18:05:27 -07:00
Patrick Devine
1b272d5bcd
change github.com/jmorganca/ollama
to github.com/ollama/ollama
( #3347 )
2024-03-26 13:04:17 -07:00
Patrick Devine
47cfe58af5
Default Keep Alive environment variable ( #3094 )
...
---------
Co-authored-by: Chris-AS1 <8493773+Chris-AS1@users.noreply.github.com>
2024-03-13 13:29:40 -07:00
Jeffrey Morgan
3b4bab3dc5
Fix embeddings load model behavior ( #2848 )
2024-02-29 17:40:56 -08:00
Ikko Eltociear Ashimine
e95b896790
Update types.go ( #2744 )
...
specfied -> specified
2024-02-25 13:41:25 -05:00
Michael Yang
897b213468
use http.DefaultClient ( #2530 )
...
default client already handles proxy
2024-02-20 18:34:47 -05:00
bnorick
caf2b13c10
Fix infinite keep_alive ( #2480 )
2024-02-13 15:40:32 -08:00
Patrick Devine
b5cf31b460
add keep_alive to generate/chat/embedding api endpoints ( #2146 )
2024-01-26 14:28:02 -08:00
Patrick Devine
7c40a67841
Save and load sessions ( #2063 )
2024-01-25 12:12:36 -08:00
Michael Yang
745b5934fa
add model to ModelResponse
2024-01-18 14:32:55 -08:00
Michael Yang
a38d88d828
api: add model for all requests
...
prefer using req.Model and fallback to req.Name
2024-01-18 14:31:37 -08:00
Michael Yang
5ffbbea1d7
remove client.py
2024-01-11 15:53:10 -08:00
Patrick Devine
22e93efa41
add show info command and fix the modelfile
2024-01-05 12:20:05 -08:00
Brian Murray
0d6e3565ae
Add embeddings to API ( #1773 )
2024-01-04 15:00:52 -05:00
Jeffrey Morgan
55978c1dc9
clean up cache api option
2023-12-27 14:27:45 -05:00
Jeffrey Morgan
d4ebdadbe7
enable cache_prompt
by default
2023-12-27 14:23:42 -05:00
K0IN
10da41d677
Add Cache flag to api ( #1642 )
2023-12-22 17:16:20 -05:00
Bruce MacDonald
d99fa6ce0a
send empty messages on last chat response ( #1530 )
2023-12-18 14:23:38 -05:00
Patrick Devine
d9e60f634b
add image support to the chat api ( #1490 )
2023-12-12 13:28:58 -08:00
Patrick Devine
910e9401d0
Multimodal support ( #1216 )
...
---------
Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local>
2023-12-11 13:56:22 -08:00
Jeffrey Morgan
9e1406e4ed
Don't expose model information in /api/generate
2023-12-09 02:05:43 -08:00
Michael Yang
c3ff36088b
Merge pull request #774 from jmorganca/mxyng/server-version
...
add version api and show server version in cli
2023-12-06 13:22:55 -08:00
Michael Yang
5d75505ebd
return model configuration in generate
2023-12-05 14:39:02 -08:00
Bruce MacDonald
195e3d9dbd
chat api endpoint ( #1392 )
2023-12-05 14:57:33 -05:00
Michael Yang
0db4706ec2
api: add version api handler
2023-12-05 09:36:01 -08:00
Jeffrey Morgan
00d06619a1
Revert "chat api ( #991 )" while context variable is fixed
...
This reverts commit 7a0899d62d
.
2023-12-04 21:16:27 -08:00
Bruce MacDonald
7a0899d62d
chat api ( #991 )
...
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
2023-12-04 18:01:06 -05:00
Patrick Devine
cde31cb220
Allow setting parameters in the REPL ( #1294 )
2023-11-29 09:56:42 -08:00
Bruce MacDonald
928950fcc6
update python client create example ( #1227 )
...
* add remote create to python example client
2023-11-27 15:36:19 -05:00
Michael Yang
bc22d5a38b
no blob response
2023-11-15 15:16:23 -08:00
Michael Yang
1901044b07
use checksum reference
2023-11-15 15:16:23 -08:00
Michael Yang
1552cee59f
client create modelfile
2023-11-15 15:16:23 -08:00
Michael Yang
3ca56b5ada
add create modelfile field
2023-11-15 15:16:23 -08:00
Jeffrey Morgan
cdddd3df65
add format
to example python client
2023-11-10 10:22:21 -08:00
Jeffrey Morgan
5cba29b9d6
JSON mode: add `"format" as an api parameter ( #1051 )
...
* add `"format": "json"` as an API parameter
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2023-11-09 16:44:02 -08:00
Bruce MacDonald
a49d6acc1e
add a complete /generate options example ( #1035 )
2023-11-08 16:44:36 -08:00
Bruce MacDonald
ec2a31e9b3
support raw generation requests ( #952 )
...
- add the optional `raw` generate request parameter to bypass prompt formatting and response context
-add raw request to docs
2023-11-08 14:05:02 -08:00
Jeffrey Morgan
17678b7225
Restore system prompt on requests and default num_keep
to 0
2023-11-03 13:25:25 -07:00
Jeffrey Morgan
06589a3b30
Set NumKeep
to 4
by default ( #982 )
2023-11-02 17:26:11 -07:00
Michael Yang
1fd511e661
Merge pull request #975 from jmorganca/mxyng/downloads
...
update downloads to use retry wrapper
2023-11-02 16:12:48 -07:00
Michael Yang
6db3691b8f
update default NumKeep
2023-11-02 15:47:35 -07:00