From 74d2a9ef9aa6a4ee31f027926f3985c9e1610346 Mon Sep 17 00:00:00 2001
From: Patrick Devine
Date: Tue, 23 Apr 2024 21:06:51 -0700
Subject: [PATCH] add OLLAMA_KEEP_ALIVE env variable to FAQ (#3865)

---
 docs/faq.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/faq.md b/docs/faq.md
index 6bd1b340..7ade43b7 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -228,3 +228,7 @@ To unload the model and free up memory use:
 ```shell
 curl http://localhost:11434/api/generate -d '{"model": "llama2", "keep_alive": 0}'
 ```
+
+Alternatively, you can change how long all models stay loaded in memory by setting the `OLLAMA_KEEP_ALIVE` environment variable when starting the Ollama server. The `OLLAMA_KEEP_ALIVE` variable accepts the same value types as the `keep_alive` parameter mentioned above. Refer to the section explaining [how to configure the Ollama server](#how-do-i-configure-ollama-server) to set the environment variable correctly.
+
+If you wish to override the `OLLAMA_KEEP_ALIVE` setting, use the `keep_alive` API parameter with the `/api/generate` or `/api/chat` API endpoints.
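
Note (not part of the patch): the behaviour the added FAQ text describes can be sketched as follows. This is a minimal illustration, assuming the server is started manually with `ollama serve` and listens on the default `localhost:11434` address used in the hunk above. The `24h` and `10m` durations are assumed example values, not defaults or recommendations.

```shell
# Terminal 1: start the server with a server-wide keep-alive.
# "24h" is an assumed example value.
OLLAMA_KEEP_ALIVE=24h ollama serve

# Terminal 2: override the server-wide setting for a single request
# by passing keep_alive in the request body ("10m" is also an assumed value).
curl http://localhost:11434/api/generate -d '{"model": "llama2", "keep_alive": "10m"}'
```

If Ollama runs as a managed service rather than from a shell, the variable would instead be set through that service's configuration, as covered in the FAQ section the patch links to.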