ollama/server
Daniel Hiltgen 5e8ff556cb Support forced spreading for multi GPU
Our default behavior today is to try to fit into a single GPU if possible.
Some users would prefer the old behavior of always spreading across
multiple GPUs even if the model can fit into one. This change exposes
that behavior as a tunable.
2024-06-14 14:51:40 -07:00
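The placement choice described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in sched.go: the `gpu` type, the `pickGPUs` function, and the `forceSpread` flag are hypothetical names invented here, and the real scheduler accounts for much more (per-layer sizing, overhead, already loaded models).

```go
package main

import (
	"fmt"
	"sort"
)

// gpu is a hypothetical, simplified view of a device.
type gpu struct {
	ID         string
	FreeMemory uint64 // bytes
}

// pickGPUs returns the devices a model would be loaded onto.
// forceSpread stands in for the tunable the commit exposes (assumed name).
func pickGPUs(gpus []gpu, required uint64, forceSpread bool) []gpu {
	if len(gpus) == 0 {
		return nil
	}
	if !forceSpread {
		// Default behavior: use a single GPU when the model fits on one.
		sorted := append([]gpu(nil), gpus...)
		sort.Slice(sorted, func(i, j int) bool {
			return sorted[i].FreeMemory > sorted[j].FreeMemory
		})
		if sorted[0].FreeMemory >= required {
			return sorted[:1]
		}
	}
	// Spread: either forced, or no single GPU has enough free memory.
	var total uint64
	for _, g := range gpus {
		total += g.FreeMemory
	}
	if total < required {
		return nil // does not fit even when spread
	}
	return gpus
}

func main() {
	gpus := []gpu{{"gpu0", 24 << 30}, {"gpu1", 24 << 30}}
	need := uint64(8 << 30)

	fmt.Println(len(pickGPUs(gpus, need, false))) // 1: fits on one GPU
	fmt.Println(len(pickGPUs(gpus, need, true)))  // 2: spreading forced
}
```

With the flag off, a model that fits on one device stays there; with it on, the same model is placed across every available device.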
auth.go Revert "use post token" 2024-05-11 22:19:14 -07:00
download.go server: skip blob verification for already verified blobs 2024-06-05 16:39:11 -07:00
fixblobs.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
fixblobs_test.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
images.go server: remove jwt decoding error (#5027) 2024-06-13 11:21:15 -07:00
layer.go Merge pull request #3718 from ollama/mxyng/modelname-3 2024-05-29 12:02:07 -07:00
manifest.go fix: skip removing layers that no longer exist 2024-06-10 11:32:19 -07:00
manifest_test.go add OLLAMA_MODELS to envconfig (#5029) 2024-06-13 12:52:03 -07:00
model.go fix: multiple templates when creating from model 2024-06-12 13:35:49 -07:00
modelpath.go add OLLAMA_MODELS to envconfig (#5029) 2024-06-13 12:52:03 -07:00
modelpath_test.go add OLLAMA_MODELS to envconfig (#5029) 2024-06-13 12:52:03 -07:00
prompt.go change github.com/jmorganca/ollama to github.com/ollama/ollama (#3347) 2024-03-26 13:04:17 -07:00
prompt_test.go change github.com/jmorganca/ollama to github.com/ollama/ollama (#3347) 2024-03-26 13:04:17 -07:00
routes.go API app/browser access (#4879) 2024-06-06 15:19:03 -07:00
routes_create_test.go add OLLAMA_MODELS to envconfig (#5029) 2024-06-13 12:52:03 -07:00
routes_delete_test.go add OLLAMA_MODELS to envconfig (#5029) 2024-06-13 12:52:03 -07:00
routes_list_test.go add OLLAMA_MODELS to envconfig (#5029) 2024-06-13 12:52:03 -07:00
routes_test.go add OLLAMA_MODELS to envconfig (#5029) 2024-06-13 12:52:03 -07:00
sched.go Support forced spreading for multi GPU 2024-06-14 14:51:40 -07:00
sched_test.go Improve multi-gpu handling at the limit 2024-06-14 14:51:40 -07:00
upload.go lint 2024-06-04 11:13:30 -07:00