Commit graph

247 commits

Author SHA1 Message Date
Daniel Hiltgen
cc269ba094 Remove no longer supported max vram var
The OLLAMA_MAX_VRAM env var was a temporary workaround for OOM
scenarios.  With Concurrency this was no longer wired up, and the simplistic
value doesn't map to multi-GPU setups.  Users can still set `num_gpu`
to limit memory usage to avoid OOM if we get our predictions wrong.
2024-07-22 09:08:11 -07:00
Patrick Devine
057d31861e
remove template (#5655) 2024-07-13 20:56:24 -07:00
Patrick Devine
23ebbaa46e Revert "remove template from tests"
This reverts commit 9ac0a7a50b.
2024-07-12 15:47:17 -07:00
Patrick Devine
9ac0a7a50b remove template from tests 2024-07-12 15:41:31 -07:00
royjhan
5f034f5b63
Include Show Info in Interactive (#5342) 2024-06-28 13:15:52 -07:00
royjhan
b910fa9010
Ollama Show: Check for Projector Type (#5307)
* Check exists projtype

* Maintain Ordering
2024-06-28 11:30:16 -07:00
Michael Yang
123a722a6f
zip: prevent extracting files into parent dirs (#5314) 2024-06-26 21:38:21 -07:00
Blake Mizerany
2aa91a937b
cmd: defer stating model info until necessary (#5248)
This commit changes the 'ollama run' command to defer fetching model
information until it really needs it. That is, when in interactive mode.

It also removes one such case where the model information is fetch in
duplicate, just before calling generateInteractive and then again, first
thing, in generateInteractive.

This positively impacts the performance of the command:

    ; time ./before run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?

    ./before run llama3 'hi'  0.02s user 0.01s system 2% cpu 1.168 total
    ; time ./before run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?

    ./before run llama3 'hi'  0.02s user 0.01s system 2% cpu 1.220 total
    ; time ./before run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?

    ./before run llama3 'hi'  0.02s user 0.01s system 2% cpu 1.217 total
    ; time ./after run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?

    ./after run llama3 'hi'  0.02s user 0.01s system 4% cpu 0.652 total
    ; time ./after run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?

    ./after run llama3 'hi'  0.01s user 0.01s system 5% cpu 0.498 total
    ; time ./after run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with or would you like to chat?

    ./after run llama3 'hi'  0.01s user 0.01s system 3% cpu 0.479 total
    ; time ./after run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?

    ./after run llama3 'hi'  0.02s user 0.01s system 5% cpu 0.507 total
    ; time ./after run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?

    ./after run llama3 'hi'  0.02s user 0.01s system 5% cpu 0.507 total
2024-06-24 20:14:03 -07:00
royjhan
fedf71635e
Extend api/show and ollama show to return more model info (#4881)
* API Show Extended

* Initial Draft of Information

Co-Authored-By: Patrick Devine <pdevine@sonic.net>

* Clean Up

* Descriptive arg error messages and other fixes

* Second Draft of Show with Projectors Included

* Remove Chat Template

* Touches

* Prevent wrapping from files

* Verbose functionality

* Docs

* Address Feedback

* Lint

* Resolve Conflicts

* Function Name

* Tests for api/show model info

* Show Test File

* Add Projector Test

* Clean routes

* Projector Check

* Move Show Test

* Touches

* Doc update

---------

Co-authored-by: Patrick Devine <pdevine@sonic.net>
2024-06-19 14:19:02 -07:00
Patrick Devine
c69bc19e46
move OLLAMA_HOST to envconfig (#5009) 2024-06-12 18:48:16 -04:00
Michael Yang
201d853fdf nolintlint 2024-06-04 11:13:30 -07:00
Michael Yang
e40145a39d lint 2024-06-04 11:13:30 -07:00
Michael Yang
8ffb51749f nolintlint 2024-06-04 11:13:30 -07:00
Michael Yang
04f3c12bb7 replace x/exp/slices with slices 2024-06-04 11:13:30 -07:00
Josh Yan
914f68f021 replaced duplicate call with variable 2024-05-30 10:38:07 -07:00
Josh Yan
bd1d119ba9 fixed japanese characters deleted at end of line 2024-05-30 10:24:21 -07:00
Lei Jitang
a03be18189
Fix OLLAMA_LLM_LIBRARY with wrong map name and add more env vars to help message (#4663)
* envconfig/config.go: Fix wrong description of OLLAMA_LLM_LIBRARY

Signed-off-by: Lei Jitang <leijitang@outlook.com>

* serve: Add more env to help message of ollama serve

Add more enviroment variables to `ollama serve --help`
to let users know what can be configurated.

Signed-off-by: Lei Jitang <leijitang@outlook.com>

---------

Signed-off-by: Lei Jitang <leijitang@outlook.com>
2024-05-30 09:36:51 -07:00
Patrick Devine
4cc3be3035
Move envconfig and consolidate env vars (#4608) 2024-05-24 14:57:15 -07:00
Josh
9f18b88a06
Merge pull request #4566 from ollama/jyan/shortcuts
add Ctrl + W shortcut
2024-05-21 22:49:36 -07:00
Josh Yan
353f83a9c7 add Ctrl + W shortcut 2024-05-21 16:55:09 -07:00
Patrick Devine
d355d2020f add fixes for llama 2024-05-20 16:13:57 -07:00
Patrick Devine
ccdf0b2a44
Move the parser back + handle utf16 files (#4533) 2024-05-20 11:26:45 -07:00
Patrick Devine
105186aa17
add OLLAMA_NOHISTORY to turn off history in interactive mode (#4508) 2024-05-18 11:51:57 -07:00
Josh Yan
3d90156e99 removed comment 2024-05-16 14:12:03 -07:00
Josh Yan
26bfc1c443 go fmt'd cmd.go 2024-05-15 17:26:39 -07:00
Josh Yan
799aa9883c go fmt'd cmd.go 2024-05-15 17:24:17 -07:00
Josh Yan
c9e584fb90 updated double-width display 2024-05-15 16:45:24 -07:00
Josh Yan
17b1e81ca1 fixed width and word count for double spacing 2024-05-15 16:29:33 -07:00
Patrick Devine
c344da4c5a
fix keepalive for non-interactive mode (#4438) 2024-05-14 15:17:04 -07:00
Patrick Devine
a4b8d1f89a
re-add system context (#4435) 2024-05-14 11:38:20 -07:00
Patrick Devine
7ca71a6b0f
don't abort when an invalid model name is used in /save (#4416) 2024-05-13 18:48:28 -07:00
Patrick Devine
6845988807
Ollama ps command for showing currently loaded models (#4327) 2024-05-13 17:17:36 -07:00
Josh Yan
f8464785a6 removed inconsistencies 2024-05-13 14:50:52 -07:00
Josh Yan
91a090a485 removed inconsistent punctuation 2024-05-13 14:08:22 -07:00
todashuta
8080fbce35
fix ollama create's usage string (#4362) 2024-05-11 14:47:49 -07:00
Jeffrey Morgan
6602e793c0
Use --quantize flag and quantize api parameter (#4321)
* rename `--quantization` to `--quantize`

* backwards

* Update api/types.go

Co-authored-by: Michael Yang <mxyng@pm.me>

---------

Co-authored-by: Michael Yang <mxyng@pm.me>
2024-05-10 13:06:13 -07:00
Tobias Gårdhus
06ac829e70
Fix help string for stop parameter (#2307) 2024-05-07 16:48:35 -07:00
Jeffrey Morgan
39d9d22ca3
close server on receiving signal (#4213) 2024-05-06 16:01:37 -07:00
Michael Yang
b7a87a22b6
Merge pull request #4059 from ollama/mxyng/parser-2
rename parser to model/file
2024-05-03 13:01:22 -07:00
Michael Yang
e9ae607ece
Merge pull request #3892 from ollama/mxyng/parser
refactor modelfile parser
2024-05-02 17:04:47 -07:00
Bryce Reitano
bf4fc25f7b
Add a /clear command (#3947)
* Add a /clear command

* change help messages

---------

Co-authored-by: Patrick Devine <patrick@infrahq.com>
2024-05-01 17:44:36 -04:00
Michael Yang
45b6a12e45 server: target invalid 2024-05-01 12:40:45 -07:00
Michael Yang
119589fcb3 rename parser to model/file 2024-05-01 09:53:50 -07:00
Michael Yang
5ea844964e cmd: import regexp 2024-05-01 09:53:45 -07:00
Michael Yang
176ad3aa6e parser: add commands format 2024-05-01 09:52:54 -07:00
Bruce MacDonald
0a7fdbe533
prompt to display and add local ollama keys to account (#3717)
- return descriptive error messages when unauthorized to create blob or push a model
- display the local public key associated with the request that was denied
2024-04-30 11:02:08 -07:00
Patrick Devine
9009bedf13
better checking for OLLAMA_HOST variable (#3661) 2024-04-29 19:14:07 -04:00
Michael Yang
41e03ede95 check file type before zip 2024-04-26 14:18:07 -07:00
Michael Yang
ac0801eced only replace if it matches command 2024-04-24 14:49:26 -07:00
Michael Yang
ad66e5b060 split temp zip files 2024-04-24 14:18:01 -07:00