ollama/server
Jesse Gross 6cd566872b sched: Lift parallel restriction for multimodal models except mllama
The Go runner has no problem supporting parallel requests for most
multimodal models. Now that we no longer risk falling back to
server.cpp, this restriction can be lifted.

However, the new mllama model can't support parallel requests, so we
will need to keep a restriction for that.
2024-11-06 13:32:18 -08:00
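
The rule described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the actual code in sched.go: the `modelInfo` type, its fields, and the `effectiveParallel` helper are hypothetical names introduced here, and the sketch only assumes that the scheduler clamps the parallel request count to 1 for mllama while leaving other models, multimodal ones included, at the requested value.

```go
package main

import "fmt"

// modelInfo is a hypothetical stand-in for whatever the scheduler knows
// about a model; it is not a type from the ollama code base.
type modelInfo struct {
	Name       string
	Multimodal bool // model has a vision projector or similar
	IsMllama   bool // model uses the mllama architecture
}

// effectiveParallel returns how many requests may run concurrently
// against model m, given the requested parallelism.
// Assumption: only mllama is restricted to a single request at a time;
// other multimodal models are no longer clamped.
func effectiveParallel(m modelInfo, requested int) int {
	if requested < 1 {
		requested = 1 // always allow at least one request
	}
	if m.IsMllama {
		return 1 // mllama cannot serve parallel requests
	}
	return requested
}

func main() {
	// Illustrative model entries; the names are placeholders.
	models := []modelInfo{
		{Name: "text-only"},
		{Name: "multimodal", Multimodal: true},
		{Name: "mllama", Multimodal: true, IsMllama: true},
	}
	for _, m := range models {
		fmt.Printf("%s: parallel=%d\n", m.Name, effectiveParallel(m, 4))
	}
}
```

Keeping the check keyed on the one architecture that needs it, rather than on "multimodal" in general, mirrors the intent stated in the commit message: the restriction is the exception, not the default.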
..
imageproc add more tests for getting the optimal tiled canvas (#7411) 2024-10-29 16:28:02 -07:00
testdata/tools server: add tool parsing support for nemotron-mini (#6849) 2024-09-17 18:06:16 -07:00
auth.go fix nil deref in auth.go 2024-07-26 14:14:48 -07:00
download.go server: fix blob download when receiving a 200 response (#6656) 2024-09-05 10:48:26 -07:00
fixblobs.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
fixblobs_test.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
images.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
layer.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest_test.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
model.go image processing for llama3.2 (#6963) 2024-10-18 16:12:35 -07:00
model_test.go server: add tool parsing support for nemotron-mini (#6849) 2024-09-17 18:06:16 -07:00
modelpath.go validate model path 2024-08-28 09:32:57 -07:00
modelpath_test.go validate model path 2024-08-28 09:32:57 -07:00
prompt.go prompt: Use a single token when estimating mllama context size 2024-11-05 10:11:50 -08:00
prompt_test.go runner.go: Better abstract vision model integration 2024-10-30 14:53:43 -07:00
routes.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
routes_create_test.go Merge pull request #6534 from ollama/mxyng/messages 2024-08-30 09:39:59 -07:00
routes_delete_test.go server: clean up route names for consistency (#6524) 2024-08-26 19:36:11 -07:00
routes_generate_test.go image processing for llama3.2 (#6963) 2024-10-18 16:12:35 -07:00
routes_list_test.go server: clean up route names for consistency (#6524) 2024-08-26 19:36:11 -07:00
routes_test.go image processing for llama3.2 (#6963) 2024-10-18 16:12:35 -07:00
sched.go sched: Lift parallel restriction for multimodal models except mllama 2024-11-06 13:32:18 -08:00
sched_test.go Rename gpu package discover (#7143) 2024-10-16 17:45:00 -07:00
sparse_common.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
sparse_windows.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
upload.go server: limit upload parts to 16 (#6411) 2024-08-19 09:20:52 -07:00