Commit graph

3183 commits

Author SHA1 Message Date
Patrick Devine
f2cf97d6f1
fix typo in modelfile generation (#4439) 2024-05-14 15:34:29 -07:00
Patrick Devine
c344da4c5a
fix keepalive for non-interactive mode (#4438) 2024-05-14 15:17:04 -07:00
Michael Yang
85a57006d1 check if name exists before create/pull/copy 2024-05-14 14:58:58 -07:00
Michael Yang
c5e892cb3e update tests 2024-05-14 14:56:31 -07:00
Michael Yang
81fb06f530 more resilient Manifests 2024-05-14 14:08:24 -07:00
Michael Yang
a385382ff5 filepath.Join 2024-05-14 14:08:24 -07:00
Michael Yang
b8772a353f remove DeleteModel 2024-05-14 14:08:24 -07:00
Michael Yang
c2714fcbfd routes: use Manifests for ListHandler 2024-05-14 14:08:24 -07:00
Michael Yang
a2fc933fed update delete handler to use model.Name 2024-05-14 14:08:24 -07:00
Michael Yang
0e331c7168
Merge pull request #4328 from ollama/mxyng/mem
count memory up to NumGPU if set by user
2024-05-14 13:47:44 -07:00
Michael Yang
ac145f75ca return on part done 2024-05-14 13:04:30 -07:00
Patrick Devine
a4b8d1f89a
re-add system context (#4435) 2024-05-14 11:38:20 -07:00
Ryo Machida
798b107f19
Fixed the API endpoint /api/tags when the model list is empty. (#4424)
* Fixed the API endpoint /api/tags to return {models: []} instead of {models: null} when the model list is empty.

* Update server/routes.go

---------

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-05-14 11:18:10 -07:00
Daniel Hiltgen
6a1b471365
Merge pull request #4430 from dhiltgen/gpu_info
Remove VRAM convergence check for windows
2024-05-14 10:59:06 -07:00
Daniel Hiltgen
ec231a7923 Remove VRAM convergence check for windows
The APIs we query are optimistic on free space, and windows pages
VRAM, so we don't have to wait to see reported usage recover on unload
2024-05-14 09:53:46 -07:00
Patrick Devine
7ca71a6b0f
don't abort when an invalid model name is used in /save (#4416) 2024-05-13 18:48:28 -07:00
Josh
7607e6e902
Merge pull request #4379 from WolfTheDeveloper/main
Update `LlamaScript` to point to new link from Legacy link.
2024-05-13 18:08:32 -07:00
Patrick Devine
f1548ef62d
update the FAQ to be more clear about windows env variables (#4415) 2024-05-13 18:01:13 -07:00
Patrick Devine
6845988807
Ollama ps command for showing currently loaded models (#4327) 2024-05-13 17:17:36 -07:00
Josh
9eed4a90ce
Merge pull request #4411 from joshyan1/main
removed inconsistent punctuation
2024-05-13 15:30:45 -07:00
Josh Yan
f8464785a6 removed inconsistencies 2024-05-13 14:50:52 -07:00
Michael Yang
1d359e737e typo 2024-05-13 14:18:34 -07:00
Michael Yang
50b9056e09 count memory up to NumGPU 2024-05-13 14:13:10 -07:00
Josh Yan
91a090a485 removed inconsistent punctuation 2024-05-13 14:08:22 -07:00
睡觉型学渣
9c76b30d72
Correct typos. (#4387)
* Correct typos.

* Correct typos.
2024-05-12 18:21:11 -07:00
Zander Lewis
93f19910c5
Update LlamaScript to point to new link.
Still used Legacy link.
2024-05-12 11:24:21 -04:00
jmorganca
4ec7445a6f Revert "use post token"
This reverts commit 0fec3525ad.
2024-05-11 22:19:14 -07:00
Michael Yang
0372c51f82
Merge pull request #4369 from ollama/mxyng/post-token
use post token
2024-05-11 19:29:14 -07:00
Michael Yang
0fec3525ad use post token 2024-05-11 19:13:16 -07:00
Jeffrey Morgan
41ba3017fd
Fix OpenAI finish_reason values when empty (#4368) 2024-05-11 15:31:41 -07:00
todashuta
8080fbce35
fix ollama create's usage string (#4362) 2024-05-11 14:47:49 -07:00
Michael Yang
ec14f6ceda
case sensitive filepaths (#4366) 2024-05-11 14:12:36 -07:00
Daniel Hiltgen
c60a086635
Merge pull request #4331 from dhiltgen/fix_unit
Fix envconfig unit test
2024-05-11 09:16:28 -07:00
jmorganca
92ca2cca95 Revert "only forward some env vars"
This reverts commit ce3b212d12.
2024-05-10 22:53:21 -07:00
Patrick Devine
1e1634daca
update go deps (#4324) 2024-05-10 21:39:27 -07:00
Daniel Hiltgen
824ee5446f Fix envconfig unit test 2024-05-10 16:49:48 -07:00
Daniel Hiltgen
879e2caf8c
Merge pull request #4329 from dhiltgen/zero_layers
Fall back to CPU runner with zero layers
2024-05-10 15:23:16 -07:00
Daniel Hiltgen
c4014e73a2 Fall back to CPU runner with zero layers 2024-05-10 15:09:48 -07:00
Daniel Hiltgen
be9efdb981
Merge pull request #4326 from dhiltgen/fix_integration
Integration fixes
2024-05-10 14:25:59 -07:00
Daniel Hiltgen
074dc3b9d8 Integration fixes 2024-05-10 14:20:10 -07:00
Daniel Hiltgen
86f9b582d5
Merge pull request #4323 from dhiltgen/sort_by_free
Always use the sorted list of GPUs
2024-05-10 14:12:15 -07:00
Daniel Hiltgen
4142c3ef7c Always use the sorted list of GPUs
Make sure the first GPU has the most free space
2024-05-10 13:53:21 -07:00
Jeffrey Morgan
6602e793c0
Use --quantize flag and quantize api parameter (#4321)
* rename `--quantization` to `--quantize`

* backwards

* Update api/types.go

Co-authored-by: Michael Yang <mxyng@pm.me>

---------

Co-authored-by: Michael Yang <mxyng@pm.me>
2024-05-10 13:06:13 -07:00
Michael Yang
ea0fdaed28
Merge pull request #4320 from ollama/mxyng/phi2-mem
add phi2 mem
2024-05-10 12:35:08 -07:00
Michael Yang
1eb382da5a add phi2 mem 2024-05-10 12:13:28 -07:00
Jeffrey Morgan
bb6fd02298
Don't clamp ctx size in PredictServerFit (#4317)
* dont clamp ctx size in `PredictServerFit`

* minimum 4 context

* remove context warning
2024-05-10 10:17:12 -07:00
Daniel Hiltgen
7e2bceceee
Merge pull request #4316 from dhiltgen/more_buffer
Bump VRAM buffer back up
2024-05-10 10:02:34 -07:00
Daniel Hiltgen
30a7d7096c Bump VRAM buffer back up
Under stress scenarios we're seeing OOMs so this should help stabilize
the allocations under heavy concurrency stress.
2024-05-10 09:15:28 -07:00
Michael Yang
200a18820e
Merge pull request #4306 from ollama/mxyng/fix-routes 2024-05-10 08:58:16 -07:00
Michael Yang
e03637176d fix(routes): skip bad manifests 2024-05-10 08:46:11 -07:00