Josh
9eed4a90ce
Merge pull request #4411 from joshyan1/main
...
removed inconsistent punctuation
2024-05-13 15:30:45 -07:00
Josh Yan
f8464785a6
removed inconsistencies
2024-05-13 14:50:52 -07:00
Michael Yang
1d359e737e
typo
2024-05-13 14:18:34 -07:00
Michael Yang
50b9056e09
count memory up to NumGPU
2024-05-13 14:13:10 -07:00
Josh Yan
91a090a485
removed inconsistent punctuation
2024-05-13 14:08:22 -07:00
睡觉型学渣
9c76b30d72
Correct typos. (#4387)
...
* Correct typos.
* Correct typos.
2024-05-12 18:21:11 -07:00
Zander Lewis
93f19910c5
Update LlamaScript to point to new link.
...
Still used Legacy link.
2024-05-12 11:24:21 -04:00
jmorganca
4ec7445a6f
Revert "use post token"
...
This reverts commit 0fec3525ad.
2024-05-11 22:19:14 -07:00
Michael Yang
0372c51f82
Merge pull request #4369 from ollama/mxyng/post-token
...
use post token
2024-05-11 19:29:14 -07:00
Michael Yang
0fec3525ad
use post token
2024-05-11 19:13:16 -07:00
Jeffrey Morgan
41ba3017fd
Fix OpenAI `finish_reason` values when empty (#4368)
2024-05-11 15:31:41 -07:00
todashuta
8080fbce35
fix `ollama create`'s usage string (#4362)
2024-05-11 14:47:49 -07:00
Michael Yang
ec14f6ceda
case sensitive filepaths (#4366)
2024-05-11 14:12:36 -07:00
Daniel Hiltgen
c60a086635
Merge pull request #4331 from dhiltgen/fix_unit
...
Fix envconfig unit test
2024-05-11 09:16:28 -07:00
jmorganca
92ca2cca95
Revert "only forward some env vars"
...
This reverts commit ce3b212d12.
2024-05-10 22:53:21 -07:00
Patrick Devine
1e1634daca
update go deps (#4324)
2024-05-10 21:39:27 -07:00
Daniel Hiltgen
824ee5446f
Fix envconfig unit test
2024-05-10 16:49:48 -07:00
Daniel Hiltgen
879e2caf8c
Merge pull request #4329 from dhiltgen/zero_layers
...
Fall back to CPU runner with zero layers
2024-05-10 15:23:16 -07:00
Daniel Hiltgen
c4014e73a2
Fall back to CPU runner with zero layers
2024-05-10 15:09:48 -07:00
Daniel Hiltgen
be9efdb981
Merge pull request #4326 from dhiltgen/fix_integration
...
Integration fixes
2024-05-10 14:25:59 -07:00
Daniel Hiltgen
074dc3b9d8
Integration fixes
2024-05-10 14:20:10 -07:00
Daniel Hiltgen
86f9b582d5
Merge pull request #4323 from dhiltgen/sort_by_free
...
Always use the sorted list of GPUs
2024-05-10 14:12:15 -07:00
Daniel Hiltgen
4142c3ef7c
Always use the sorted list of GPUs
...
Make sure the first GPU has the most free space
2024-05-10 13:53:21 -07:00
Jeffrey Morgan
6602e793c0
Use `--quantize` flag and `quantize` api parameter (#4321)
...
* rename `--quantization` to `--quantize`
* backwards
* Update api/types.go
Co-authored-by: Michael Yang <mxyng@pm.me>
---------
Co-authored-by: Michael Yang <mxyng@pm.me>
2024-05-10 13:06:13 -07:00
Michael Yang
ea0fdaed28
Merge pull request #4320 from ollama/mxyng/phi2-mem
...
add phi2 mem
2024-05-10 12:35:08 -07:00
Michael Yang
1eb382da5a
add phi2 mem
2024-05-10 12:13:28 -07:00
Jeffrey Morgan
bb6fd02298
Don't clamp ctx size in `PredictServerFit` (#4317)
...
* dont clamp ctx size in `PredictServerFit`
* minimum 4 context
* remove context warning
2024-05-10 10:17:12 -07:00
Daniel Hiltgen
7e2bceceee
Merge pull request #4316 from dhiltgen/more_buffer
...
Bump VRAM buffer back up
2024-05-10 10:02:34 -07:00
Daniel Hiltgen
30a7d7096c
Bump VRAM buffer back up
...
Under stress scenarios we're seeing OOMs so this should help stabilize
the allocations under heavy concurrency stress.
2024-05-10 09:15:28 -07:00
Michael Yang
200a18820e
Merge pull request #4306 from ollama/mxyng/fix-routes
2024-05-10 08:58:16 -07:00
Michael Yang
e03637176d
fix(routes): skip bad manifests
2024-05-10 08:46:11 -07:00
Bruce MacDonald
c02db93243
omit empty done reason
2024-05-09 16:45:29 -07:00
Michael Yang
ffa4d5134a
Merge pull request #4305 from ollama/mxyng/typo
...
fix typo
2024-05-09 16:42:09 -07:00
Jeffrey Morgan
302d7fdbf3
prune partial downloads (#4272)
2024-05-09 16:35:20 -07:00
Michael Yang
cf442cd57e
fix typo
2024-05-09 16:23:37 -07:00
Michael Yang
0e1ba65855
Merge pull request #4302 from ollama/mxyng/forward-env
...
only forward some env vars
2024-05-09 16:21:05 -07:00
Michael Yang
6aad333c63
Merge pull request #4298 from ollama/mxyng/log-cleanup
...
log clean up
2024-05-09 16:20:57 -07:00
Daniel Hiltgen
4fcc84e67a
Merge pull request #4304 from dhiltgen/signals
...
Fix race in shutdown logic
2024-05-09 15:58:44 -07:00
Daniel Hiltgen
3ae2f441e0
Fix race in shutdown logic
...
Ensure the runners are terminated
2024-05-09 15:54:02 -07:00
Zander Lewis
2abb3f6424
Update README.md (#4300)
2024-05-09 15:30:49 -07:00
Michael Yang
ce3b212d12
only forward some env vars
2024-05-09 15:16:09 -07:00
Daniel Hiltgen
83d6d46e29
Merge pull request #4299 from dhiltgen/handle_vram_reporting_lag
...
Wait for GPU free memory reporting to converge
2024-05-09 15:08:56 -07:00
Daniel Hiltgen
354ad9254e
Wait for GPU free memory reporting to converge
...
The GPU drivers take a while to update their free memory reporting, so we need
to wait until the values converge with what we're expecting before proceeding
to start another runner in order to get an accurate picture.
2024-05-09 14:56:01 -07:00
Michael Yang
58876091f7
log clean up
2024-05-09 14:55:36 -07:00
Daniel Hiltgen
dc18eee39d
Merge pull request #4238 from dhiltgen/gpu_info
...
Record more GPU information
2024-05-09 14:26:58 -07:00
Daniel Hiltgen
8727a9c140
Record more GPU information
...
This cleans up the logging for GPU discovery a bit, and can
serve as a foundation to report GPU information in a future UX.
2024-05-09 14:18:14 -07:00
Daniel Hiltgen
d0425f26cf
Merge pull request #4294 from dhiltgen/harden_subprocess_reaping
...
Harden subprocess reaping
2024-05-09 14:02:16 -07:00
Bruce MacDonald
cfa84b8470
add done_reason to the api (#4235)
2024-05-09 13:30:14 -07:00
Michael Yang
1580ed4c06
Merge pull request #4295 from ollama/mxyng/fix-list
...
routes: skip invalid filepaths
2024-05-09 11:37:34 -07:00
Michael Yang
a7ee84fc31
routes: skip invalid filepaths
2024-05-09 11:23:22 -07:00