Commit graph

16 commits

Author SHA1 Message Date
Daniel Hiltgen
6f351bf586 review comments and coverage 2024-06-14 14:55:50 -07:00
Daniel Hiltgen
68dfc6236a refined test timing
adjust timing on some tests so they don't timeout on small/slow GPUs
2024-06-14 14:51:40 -07:00
Daniel Hiltgen
6fd04ca922 Improve multi-gpu handling at the limit
Still not complete, needs some refinement to our prediction to understand the
discrete GPUs available space so we can see how many layers fit in each one
since we can't split one layer across multiple GPUs we can't treat free space
as one logical block
2024-06-14 14:51:40 -07:00
Daniel Hiltgen
206797bda4 Fix concurrency integration test to work locally
This worked remotely but wound up trying to spawn multiple servers
locally which doesn't work
2024-06-14 14:51:40 -07:00
Daniel Hiltgen
7f2fbad736 Skip max queue test on remote
This test needs to be able to adjust the queue size down from
our default setting for a reliable test, so it needs to skip on
remote test execution mode.
2024-05-16 16:24:18 -07:00
Daniel Hiltgen
074dc3b9d8 Integration fixes 2024-05-10 14:20:10 -07:00
Michael Yang
a7248f6ea8 update tests 2024-05-06 15:24:01 -07:00
Daniel Hiltgen
45d61aaaa3 Add integration test to push max queue limits 2024-05-05 10:46:25 -07:00
Daniel Hiltgen
f2ea8470e5 Local unicode test case 2024-04-22 19:29:12 -07:00
Daniel Hiltgen
34b9db5afc Request and model concurrency
This change adds support for multiple concurrent requests, as well as
loading multiple models by spawning multiple runners. The default
settings are currently set at 1 concurrent request per model and only 1
loaded model at a time, but these can be adjusted by setting
OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.
2024-04-22 19:29:12 -07:00
Daniel Hiltgen
aeb1fb5192 Add test case for context exhaustion
Confirmed this fails on 0.1.30 with known regression
but passes on main
2024-04-04 07:42:17 -07:00
Jeffrey Morgan
cd135317d2
Fix macOS builds on older SDKs (#3467) 2024-04-03 10:45:54 -07:00
Daniel Hiltgen
4fec5816d6 Integration test improvements
Cleaner shutdown logic, a bit of response hardening
2024-04-01 16:48:18 -07:00
Patrick Devine
1b272d5bcd
change github.com/jmorganca/ollama to github.com/ollama/ollama (#3347) 2024-03-26 13:04:17 -07:00
Daniel Hiltgen
7b6cbc10ec Integration tests conditionally pull
If images aren't present, pull them.
Also fixes the expected responses
2024-03-25 08:57:45 -07:00
Daniel Hiltgen
949b6c01e0 Revamp go based integration tests
This uplevels the integration tests to run the server which can allow
testing an existing server, or a remote server.
2024-03-23 14:24:18 +01:00