ollama/api
Daniel Hiltgen 171796791f Adjust mmap logic for cuda windows for faster model load
On Windows, recent llama.cpp changes make mmap slower in most
cases, so default to off. This also implements a tri-state for
use_mmap so we can distinguish a user-provided value of true or
false from an unspecified one.
2024-06-17 16:54:30 -07:00
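
The tri-state described in the commit message can be illustrated with a minimal Go sketch. It is not the code from types.go: the names `TriState`, `Unspecified`, `Options`, `parseOptions`, and `resolveMMap` are hypothetical, and the platform default is passed in as a plain parameter rather than detected. The idea is that a plain `bool` cannot tell "the user sent false" apart from "the user sent nothing", so a third state is kept as the zero value and only overwritten when the JSON key is actually present.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TriState is a hypothetical three-valued flag. The zero value means
// "the user did not say", so a freshly created Options starts Unspecified.
type TriState int

const (
	Unspecified TriState = iota
	False
	True
)

// UnmarshalJSON maps JSON true/false onto True/False. encoding/json only
// calls it when the key is present, so an absent key leaves Unspecified.
func (t *TriState) UnmarshalJSON(b []byte) error {
	if string(b) == "null" {
		return nil // treat an explicit null like an absent key
	}
	var v bool
	if err := json.Unmarshal(b, &v); err != nil {
		return err
	}
	if v {
		*t = True
	} else {
		*t = False
	}
	return nil
}

// Options is a hypothetical request-options struct with one tri-state field.
type Options struct {
	UseMMap TriState `json:"use_mmap"`
}

// parseOptions decodes a JSON options document.
func parseOptions(s string) (Options, error) {
	var o Options
	err := json.Unmarshal([]byte(s), &o)
	return o, err
}

// resolveMMap honors an explicit user choice and otherwise falls back to
// the platform default (assumed here: false where mmap is slower).
func resolveMMap(o Options, platformDefault bool) bool {
	switch o.UseMMap {
	case True:
		return true
	case False:
		return false
	default:
		return platformDefault
	}
}

func main() {
	for _, in := range []string{`{}`, `{"use_mmap": true}`, `{"use_mmap": false}`} {
		o, err := parseOptions(in)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		// platformDefault=false stands in for a platform where mmap is off by default.
		fmt.Printf("%-22s -> mmap=%v\n", in, resolveMMap(o, false))
	}
}
```

With this shape, a caller can change the default per platform without overriding a user who explicitly asked for mmap on or off. A `*bool` field would be an alternative way to get the same three states.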
client.go move OLLAMA_HOST to envconfig (#5009) 2024-06-12 18:48:16 -04:00
client_test.go move OLLAMA_HOST to envconfig (#5009) 2024-06-12 18:48:16 -04:00
types.go Adjust mmap logic for cuda windows for faster model load 2024-06-17 16:54:30 -07:00
types_test.go Adjust mmap logic for cuda windows for faster model load 2024-06-17 16:54:30 -07:00