Daniel Hiltgen
|
026869915f
|
Merge pull request #4144 from dhiltgen/max_queue
Make maximum pending request configurable
|
2024-05-05 10:53:44 -07:00 |
|
Daniel Hiltgen
|
45d61aaaa3
|
Add integration test to push max queue limits
|
2024-05-05 10:46:25 -07:00 |
|
Daniel Hiltgen
|
20f6c06569
|
Make maximum pending request configurable
This also bumps up the default to be 50 queued requests
instead of 10.
|
2024-05-04 21:00:52 -07:00 |
|
Daniel Hiltgen
|
371f5e52aa
|
Merge pull request #4141 from dhiltgen/win_docs
Explain the 2 different windows download options
|
2024-05-04 12:50:16 -07:00 |
|
Daniel Hiltgen
|
e006480e49
|
Explain the 2 different windows download options
|
2024-05-04 12:50:05 -07:00 |
|
Michael Yang
|
aed545872d
|
Merge pull request #4143 from ollama/mxyng/final-response
omit prompt and generate settings from final response
|
2024-05-03 17:39:49 -07:00 |
|
Michael Yang
|
44869c59d6
|
omit prompt and generate settings from final response
|
2024-05-03 17:00:02 -07:00 |
|
Daniel Hiltgen
|
52663284cf
|
Merge pull request #4145 from dhiltgen/fix_lint
Fix lint warnings
|
2024-05-03 16:53:17 -07:00 |
|
Daniel Hiltgen
|
42fa9d7f0a
|
Fix lint warnings
|
2024-05-03 16:44:19 -07:00 |
|
Michael Yang
|
b7a87a22b6
|
Merge pull request #4059 from ollama/mxyng/parser-2
rename parser to model/file
|
2024-05-03 13:01:22 -07:00 |
|
Dr Nic Williams
|
e8aaea030e
|
Update 'llama2' -> 'llama3' in most places (#4116)
* Update 'llama2' -> 'llama3' in most places
---------
Co-authored-by: Patrick Devine <patrick@infrahq.com>
|
2024-05-03 15:25:04 -04:00 |
|
Daniel Hiltgen
|
b1ad3a43cb
|
Skip PhysX cudart library
For some reason this library gives incorrect GPU information, so skip it
|
2024-05-03 11:55:32 -07:00 |
|
Daniel Hiltgen
|
267e25a750
|
Merge pull request #4129 from dhiltgen/unit_tests
Soften timeouts on sched unit tests
|
2024-05-03 11:10:26 -07:00 |
|
Daniel Hiltgen
|
9a32c514cb
|
Soften timeouts on sched unit tests
This gives us more headroom on the scheduler tests to tamp
down some flakes.
|
2024-05-03 09:08:33 -07:00 |
|
Michael Yang
|
e9ae607ece
|
Merge pull request #3892 from ollama/mxyng/parser
refactor modelfile parser
|
2024-05-02 17:04:47 -07:00 |
|
Michael Yang
|
93707fa3f2
|
Merge pull request #4108 from ollama/mxyng/lf
fix line ending
|
2024-05-02 14:55:15 -07:00 |
|
Michael Yang
|
94c369095f
|
fix line ending
replace CRLF with LF
|
2024-05-02 14:53:13 -07:00 |
|
Jeffrey Morgan
|
9164b0161b
|
Update .gitattributes
|
2024-05-02 14:06:31 -04:00 |
|
Daniel Hiltgen
|
e592e8fccb
|
Support Fedora's standard ROCm location
|
2024-05-01 15:47:12 -07:00 |
|
Bryce Reitano
|
bf4fc25f7b
|
Add a /clear command (#3947)
* Add a /clear command
* change help messages
---------
Co-authored-by: Patrick Devine <patrick@infrahq.com>
|
2024-05-01 17:44:36 -04:00 |
|
Michael Yang
|
5b806d8d24
|
Merge pull request #4089 from ollama/mxyng/target-invalid
server: destination invalid
|
2024-05-01 12:46:35 -07:00 |
|
Michael Yang
|
cb1e072643
|
Merge pull request #4087 from ollama/mxyng/fix-host-port
types/model: fix name for hostport
|
2024-05-01 12:42:07 -07:00 |
|
Michael Yang
|
45b6a12e45
|
server: target invalid
|
2024-05-01 12:40:45 -07:00 |
|
alwqx
|
68755f1f5e
|
chore: fix typo in docs/development.md (#4073)
|
2024-05-01 15:39:11 -04:00 |
|
Michael Yang
|
997a455039
|
want filepath
|
2024-05-01 12:33:41 -07:00 |
|
Michael Yang
|
88775e1ff9
|
strip scheme from name
|
2024-05-01 12:26:19 -07:00 |
|
Michael Yang
|
8867e744ff
|
types/model: fix name for hostport
|
2024-05-01 12:14:53 -07:00 |
|
Daniel Hiltgen
|
4fd064bea6
|
Merge pull request #4031 from MarkWard0110/fix/issue-3736
Fix/issue 3736: When runners are closing or expiring, the scheduler is getting dirty VRAM size readings.
|
2024-05-01 12:13:26 -07:00 |
|
Jeffrey Morgan
|
59fbceedcc
|
use lf for line endings (#4085)
|
2024-05-01 15:02:45 -04:00 |
|
Mark Ward
|
321d57e1a0
|
Removing goroutine calling .wait from load.
|
2024-05-01 18:51:10 +00:00 |
|
Mark Ward
|
ba26c7aa00
|
it will always return an error due to Kill() discarding Wait() errors
|
2024-05-01 18:51:10 +00:00 |
|
Mark Ward
|
63c763685f
|
log while waiting for the process to stop, to help debug when other tasks execute during this wait.
the expire timer clears its timer reference because it will not be reused.
close will clean up expireTimer if the calling code has not already done this.
|
2024-05-01 18:51:10 +00:00 |
|
Mark Ward
|
34a4a94f13
|
ignore debug bin files
|
2024-05-01 18:51:10 +00:00 |
|
Mark Ward
|
f4a73d57a4
|
fix runner expiring during active use. Clear the expire timer while the runner is in use, and allow finish to assign an expire timer so that the runner expires after no use.
|
2024-05-01 18:51:10 +00:00 |
|
Mark Ward
|
948114e3e3
|
fix sched to wait for the runner to terminate so that the following VRAM check is more accurate
|
2024-05-01 18:51:10 +00:00 |
|
Arpit Jain
|
a3e60d9058
|
README.md: fix typos (#4007)
Co-authored-by: Blake Mizerany <blake.mizerany@gmail.com>
|
2024-05-01 10:39:38 -07:00 |
|
Michael Yang
|
8acb233668
|
use strings.Builder
|
2024-05-01 10:01:09 -07:00 |
|
Michael Yang
|
119589fcb3
|
rename parser to model/file
|
2024-05-01 09:53:50 -07:00 |
|
Michael Yang
|
5ea844964e
|
cmd: import regexp
|
2024-05-01 09:53:45 -07:00 |
|
Michael Yang
|
bd8eed57fc
|
fix parser name
|
2024-05-01 09:52:54 -07:00 |
|
Michael Yang
|
9cf0f2e973
|
use parser.Format instead of templating modelfile
|
2024-05-01 09:52:54 -07:00 |
|
Michael Yang
|
176ad3aa6e
|
parser: add commands format
|
2024-05-01 09:52:54 -07:00 |
|
Michael Yang
|
4d08363580
|
comments
|
2024-05-01 09:52:54 -07:00 |
|
Michael Yang
|
8907bf51d2
|
fix multiline
|
2024-05-01 09:52:54 -07:00 |
|
Michael Yang
|
abe614c705
|
tests
|
2024-05-01 09:52:54 -07:00 |
|
Michael Yang
|
238715037d
|
linting
|
2024-05-01 09:52:54 -07:00 |
|
Michael Yang
|
c0a00f68ae
|
refactor modelfile parser
|
2024-05-01 09:52:54 -07:00 |
|
Jeffrey Morgan
|
f0c454ab57
|
gpu: add 512MiB to darwin minimum, metal doesn't have partial offloading overhead (#4068)
|
2024-05-01 11:46:03 -04:00 |
|
Daniel Hiltgen
|
089daaeabc
|
Add CUDA Driver API for GPU discovery
We're seeing some corner cases with cudart which might be resolved by
switching to the driver API which comes bundled with the driver package
|
2024-04-30 18:00:45 -07:00 |
|
Blake Mizerany
|
b9f74ff3d6
|
types/model: reintroduce Digest (#4065)
|
2024-04-30 16:38:03 -07:00 |
|